NOVEL PLANT ACYLTRANSFERASES




INTRODUCTION 

This application claims the benefit of U.S. Provisional Application Serial No.
60/101,939, filed September 25, 1998.

Technical Field 

The present invention is directed to nucleic acid and amino acid sequences and 
constructs, and methods related thereto. 

Background

Through the development of plant genetic engineering techniques, it is now possible to
produce transgenic varieties of plant species that have novel and desirable
characteristics. For example, it is now possible to genetically engineer plants for tolerance to
environmental stresses, such as resistance to pathogens and tolerance to herbicides, and to
improve the quality characteristics of the plant, for example improved fatty acid compositions.

However, the number of useful nucleotide sequences for the engineering of such
characteristics is thus far limited, and the pace at which new useful nucleotide sequences
for engineering new characteristics are identified is slow.

The characterization of various acyltransferase proteins is useful for the further study
of plant fatty acid synthesis systems and for the development of novel and/or alternative oil
sources. Studies of plant mechanisms may provide means to further enhance, control,
modify, or otherwise alter the total fatty acyl composition of triglycerides and oils.
Furthermore, the elucidation of the factor(s) critical to the natural production of fatty acids in
plants is desired, including the purification of such factors and the characterization of
element(s) and/or cofactors which enhance the efficiency of the system. Of particular interest
are the nucleic acid sequences of genes encoding proteins which may be useful for
applications in genetic engineering.

SUMMARY OF THE INVENTION

The present invention provides nucleic acid sequences encoding amino acid
sequences for a class of proteins which are related to acyltransferase proteins. Such proteins
are referred to herein as acyltransferase related or acyltransferase like proteins.

By this invention, nucleic acid sequences encoding these acyltransferase related
proteins may now be characterized with respect to enzyme activity. In particular,
identification and isolation of nucleic acid sequences encoding acyltransferase related
proteins from Arabidopsis, yeast, corn, and soybean are provided.

Thus, this invention encompasses acyltransferase related nucleic acid sequences and 
the corresponding amino acid sequences, and the use of these nucleic acid sequences in the 
preparation of oligonucleotides containing such acyltransferase related encoding sequences 
for analysis and recovery of plant acyltransferase related gene sequences. The acyltransferase 
related encoding sequence may encode a complete or partial sequence depending upon the 
intended use. All or a portion of the genomic sequence, or cDNA sequence, is intended. 

Of special interest are recombinant DNA constructs which provide for transcription or 
transcription and translation (expression) of the acyltransferase related sequences in host 
cells. In particular, constructs which are capable of transcription or transcription and 
translation in plant host cells are preferred. For some applications a reduction in the
expression of sequences encoding acyltransferase related proteins may be desired. Thus,
recombinant constructs may be designed having the acyltransferase related sequences in a
reverse orientation for expression of an anti-sense sequence, or co-suppression (also known
as "transwitch") constructs may be used. Such constructs may contain a variety of regulatory regions
including transcriptional initiation regions obtained from genes preferentially expressed in 
plant seed tissue. For some uses, it may be desired to use the transcriptional and translational 
initiation regions of the acyltransferase related gene either with the acyltransferase related 
encoding sequence or to direct the transcription and translation of a heterologous sequence. 

Also considered in this invention are the plants and seeds containing the constructs
and polynucleotides of this invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 provides the 204 amino acid conserved sequence profile identified from
comparisons of glycerol-3-phosphate acyltransferase and various lysophosphatidic acid
acyltransferases using PSI-BLAST.

Figure 2 provides an amino acid sequence alignment for the acyltransferase
sequences. The alignment shown is of the regions of the protein extending from about 30
amino acids prior to the conserved H in the conserved sequence HXXXXD to 100 amino
acids after, or downstream of, the P in the conserved PEG sequence motif of the
acyltransferase-like sequences.

Figure 3 provides schematics showing the relationship of the identified
acyltransferases. The relationships described are derived from an alignment of the regions of
the protein extending from about 30 amino acids prior to the conserved H in the conserved
sequence HXXXXD to 100 amino acids after, or downstream of, the P in the conserved PEG
sequence motif of the acyltransferase-like sequences. Figure 3A provides a phylogenetic tree
showing the relationship of several acyltransferases. Figure 3B provides a table showing the
percent similarities and percent divergence of the novel acyltransferases and known
acyltransferases using the Clustal method with PAM250 residue weight table.
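The region used for the alignments of Figures 2 and 3 can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions: the function name conserved_window is hypothetical, the first HXXXXD match and the first PEG after it are taken as the landmarks, and the input is a single-letter amino acid string. It is not the procedure used to prepare the figures.

```python
import re


def conserved_window(protein, upstream=30, downstream=100):
    """Return the slice running from `upstream` residues before the H of the
    first HXXXXD motif to `downstream` residues after the P of the first PEG
    motif that follows it. Returns None if either motif is absent.

    Assumption: the first occurrence of each motif is the relevant one.
    """
    h_match = re.search(r"H.{4}D", protein)
    if h_match is None:
        return None
    # Look for PEG only downstream of the HXXXXD motif.
    p_match = re.compile(r"PEG").search(protein, h_match.end())
    if p_match is None:
        return None
    start = max(0, h_match.start() - upstream)
    # p_match.start() is the index of P; keep `downstream` residues after it.
    end = min(len(protein), p_match.start() + 1 + downstream)
    return protein[start:end]


if __name__ == "__main__":
    # Toy sequence, not taken from the Sequence Listing.
    toy = "A" * 40 + "HAAAAD" + "G" * 20 + "PEG" + "K" * 120
    window = conserved_window(toy)
    print(len(window), window[30:36])
```

Windows extracted in this way from several sequences could then be passed to a multiple alignment program; the clipping to the sequence ends handles proteins with fewer than 30 residues before the H or fewer than 100 after the P.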


DETAILED DESCRIPTION OF THE INVENTION 

In accordance with the subject invention, nucleotide sequences are provided which are
capable of encoding sequences of amino acids, such as a protein, polypeptide or peptide, which
are related to nucleic acid sequences encoding acyltransferase proteins, referred to herein as
acyltransferase-like or acyltransferase related. The novel nucleic acid sequences find use in
the preparation of constructs to direct their expression in a host cell. Furthermore, the novel
nucleic acid sequences may find use in the preparation of plant expression constructs to
modify the fatty acid composition of a plant cell.

In one embodiment of the present invention, nucleic acid sequences, also referred to
herein as polynucleotides, which are related to acyltransferases are identified from databases.

Isolated Proteins, Polypeptides and Polynucleotides

A first aspect of the present invention relates to isolated acyltransferase
polynucleotides. The polynucleotide sequences of the present invention include isolated
polynucleotides that encode the polypeptides of the invention having a deduced amino acid
sequence selected from the group of sequences set forth in the Sequence Listing and to other
polynucleotide sequences closely related to such sequences and variants thereof.

The invention provides a polynucleotide sequence identical over its entire length to
each coding sequence as set forth in the Sequence Listing. The invention also provides the
coding sequence for the mature polypeptide or a fragment thereof, as well as the coding
sequence for the mature polypeptide or a fragment thereof in a reading frame with other
coding sequences, such as those encoding a leader or secretory sequence, a pre-, pro-, or
prepro- protein sequence. The polynucleotide can also include non-coding sequences,
including for example, but not limited to, non-coding 5' and 3' sequences, such as the
transcribed, untranslated sequences, termination signals, ribosome binding sites, sequences
that stabilize mRNA, introns, polyadenylation signals, and additional coding sequence that
encodes additional amino acids. For example, a marker sequence can be included to facilitate
the purification of the fused polypeptide. Polynucleotides of the present invention also
include polynucleotides comprising a structural gene and the naturally associated sequences
that control gene expression.

The invention also includes polynucleotides of the formula:
X-(R1)n-(R2)-(R3)n-Y

wherein, at the 5' end, X is hydrogen, and at the 3' end, Y is hydrogen or a metal, R1 and R3
are any nucleic acid residue, n is an integer between 1 and 3000, preferably between 1 and
1000, and R2 is a nucleic acid sequence of the invention, particularly a nucleic acid sequence
selected from the group set forth in the Sequence Listing and preferably SEQ ID NOs: 1, 3, 5,
7, 9, 10, 12, 14, 16, 18, 20, 22, and 226-233. In the formula, R2 is oriented so that its 5' end
residue is at the left, bound to R1, and its 3' end residue is at the right, bound to R3. Any
stretch of nucleic acid residues denoted by either R group, where R is greater than 1, may be
either a heteropolymer or a homopolymer, preferably a heteropolymer.

The invention also relates to variants of the polynucleotides described herein that
encode variants of the polypeptides of the invention. Variants that are fragments of the
polynucleotides of the invention can be used to synthesize full-length polynucleotides of the
invention. Preferred embodiments are polynucleotides encoding polypeptide variants wherein
5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid residues of a polypeptide sequence of the
invention are substituted, added or deleted, in any combination. Particularly preferred are
substitutions, additions, and deletions that are silent such that they do not alter the properties
or activities of the polynucleotide or polypeptide.

Nucleotide sequences encoding acyltransferases may be obtained from natural sources
or be partially or wholly artificially synthesized. They may directly correspond to an
acyltransferase endogenous to a natural source or contain modified amino acid sequences,
such as sequences which have been mutated, truncated, increased or the like. Acyltransferases
may be obtained by a variety of methods, including but not limited to, partial or homogenous
purification of protein extracts, protein modeling, nucleic acid probes, antibody preparations
and sequence comparisons. Typically an acyltransferase will be derived in whole or in part
from a natural source. A natural source includes, but is not limited to, prokaryotic and
eukaryotic sources, including bacteria, yeasts, plants, including algae, and the like.

Of special interest are acyltransferases which are obtainable from eukaryotic sources,
including those which are obtained from plants, or acyltransferases which are
obtainable through the use of these sequences. "Obtainable" refers to those acyltransferases
which have sufficiently similar sequences to that of the sequences provided herein to provide
a biologically active protein of the present invention.

Further preferred embodiments of the invention are polynucleotides that are at least 50%,
60%, or 70% identical over their entire length to a polynucleotide encoding a polypeptide of
the invention, and polynucleotides that are complementary to such polynucleotides. More
preferable are polynucleotides that comprise a region that is at least 80% identical over its
entire length to a polynucleotide encoding a polypeptide of the invention and polynucleotides
that are complementary thereto. In this regard, polynucleotides at least 90% identical over
their entire length are particularly preferred, and those at least 95% identical are especially
preferred. Further, those with at least 97% identity are highly preferred and those with at
least 98% and 99% identity are particularly highly preferred, with those at least 99% being
the most highly preferred.
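As an illustration of the identity thresholds recited above, the following Python sketch computes percent identity from two sequences that have already been aligned and reports the highest threshold met. The function names, the gap character, and the choice of the alignment length as denominator are assumptions made for illustration; alignment programs differ in how they define the denominator.

```python
# Identity thresholds recited in the text, in ascending order.
IDENTITY_TIERS = (50, 60, 70, 80, 90, 95, 97, 98, 99)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding identical residues.

    Assumption: both inputs are rows of one pairwise alignment, so they have
    equal length; columns that are gaps in both rows are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def highest_tier(identity):
    """Return the largest recited threshold that `identity` meets, or None."""
    met = [tier for tier in IDENTITY_TIERS if identity >= tier]
    return max(met) if met else None


if __name__ == "__main__":
    pid = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pid, highest_tier(pid))
```

A gapped column counts as a non-identical position in this sketch, which makes the result conservative for sequences of unequal length.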

Preferred embodiments are polynucleotides that encode polypeptides that retain
substantially the same biological function or activity as the mature polypeptides encoded by
the polynucleotides set forth in the Sequence Listing.



WO 00/18889 PCT/US99/22231 


The invention further relates to polynucleotides that hybridize to the above-described
sequences. In particular, the invention relates to polynucleotides that hybridize under
stringent conditions to the above-described polynucleotides. As used herein, the terms
"stringent conditions" and "stringent hybridization conditions" mean that hybridization will
generally occur if there is at least 95% and preferably at least 97% identity between the
sequences. An example of stringent hybridization conditions is overnight incubation at 42°C
in a solution comprising 50% formamide, 5x SSC (1x SSC being 150 mM NaCl, 15 mM trisodium
citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20
micrograms/milliliter denatured, sheared salmon sperm DNA, followed by washing the
hybridization support in 0.1x SSC at approximately 65°C. Other hybridization and wash
conditions are well known and are exemplified in Sambrook, et al., Molecular Cloning: A
Laboratory Manual, Second Edition, Cold Spring Harbor, NY (1989), particularly Chapter 11.
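The salt concentrations implied by the SSC strengths in the example conditions can be worked out with a small helper. The sketch below assumes, as is conventional, that the parenthetical composition (150 mM NaCl, 15 mM trisodium citrate) describes 1x SSC; the function name ssc_composition is hypothetical.

```python
# Assumed composition of 1x SSC, in millimolar.
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_composition(fold):
    """Return the millimolar concentration of each solute at `fold`x SSC."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {solute: mm * fold for solute, mm in SSC_1X_MM.items()}


if __name__ == "__main__":
    for fold in (5, 0.1):
        print(f"{fold}x SSC:", ssc_composition(fold))
```

Under this assumption the 5x hybridization solution contains 750 mM NaCl, while the 0.1x wash contains about 15 mM NaCl; the lower salt of the wash, together with the higher temperature, is what makes the wash step stringent.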

The invention also provides a polynucleotide consisting essentially of a 
polynucleotide sequence obtainable by screening an appropriate library containing the 
complete gene for a polynucleotide sequence set forth in the Sequence Listing under stringent
hybridization conditions with a probe having the sequence of said polynucleotide sequence or 
a fragment thereof; and isolating said polynucleotide sequence. Fragments useful for 
obtaining such a polynucleotide include, for example, probes and primers as described herein. 

As discussed herein regarding polynucleotide assays of the invention, for example, 
polynucleotides of the invention can be used as a hybridization probe for RNA, cDNA, or 
genomic DNA to isolate full length cDNAs or genomic clones encoding a polypeptide and to 
isolate cDNA or genomic clones of other genes that have a high sequence similarity to a 
polynucleotide set forth in the Sequence Listing. Such probes will generally comprise at least 
15 bases. Preferably such probes will have at least 30 bases and can have at least 50 bases. 
Particularly preferred probes will have between 30 bases and 50 bases, inclusive. 

The coding region of each gene that comprises or is comprised by a polynucleotide 
sequence set forth in the Sequence Listing may be isolated by screening using a DNA 
sequence provided in the Sequence Listing to synthesize an oligonucleotide probe. A labeled 
oligonucleotide having a sequence complementary to that of a gene of the invention is then 
used to screen a library of cDNA, genomic DNA or mRNA to identify members of the library 
which hybridize to the probe. For example, synthetic oligonucleotides are prepared which 
correspond to the N-terminal sequence of the polypeptide. The partial sequences so prepared 
can then be used as probes to obtain acyltransferase clones from a gene library prepared from
a cell source of interest. Alternatively, where oligonucleotides of low degeneracy can be
prepared from particular peptides, such probes may be used directly to screen gene libraries 
for gene sequences. In particular, screening of cDNA libraries in phage vectors is useful in 
such methods due to lower levels of background hybridization. 
Typically, a sequence obtainable from the use of nucleic acid probes will show 60-70%
sequence identity between the target acyltransferase sequence and the encoding sequence
used as a probe. However, lengthy sequences with as little as 50-60% sequence identity may
also be obtained. The nucleic acid probes may be a lengthy fragment of the nucleic acid
sequence, or may also be a shorter, oligonucleotide probe. When longer nucleic acid
fragments are employed as probes (greater than about 100 bp), one may screen at lower
stringencies in order to obtain sequences from the target sample which have 20-50%
deviation (i.e., 50-80% sequence homology) from the sequences used as probe.
Oligonucleotide probes can be considerably shorter than the entire nucleic acid sequence
encoding an acyltransferase enzyme, but should be at least about 10, preferably at least about
15, and more preferably at least about 20 nucleotides. A higher degree of sequence identity is
desired when shorter regions are used as opposed to longer regions. It may thus be desirable
to identify regions of highly conserved amino acid sequence to design oligonucleotide probes
for detecting and recovering other related genes. Shorter probes are often particularly useful
for polymerase chain reactions (PCR), especially when highly conserved sequences can be
identified. (See Gould, et al., PNAS USA (1989) 86:1934-1938).

The skilled artisan will appreciate that, in many cases, an isolated cDNA sequence
will be incomplete, in that the region coding for the polypeptide is truncated with respect to
the 5' terminus of the cDNA. This is a consequence of the reverse transcriptase, an enzyme
with low 'processivity' (a measure of the ability of the enzyme to remain attached to the
template during the polymerization reaction) employed during the first strand cDNA
synthesis.

Several methods are available, and are well known to the skilled artisan, to obtain
full-length cDNAs, or extend short cDNAs, for example those based on the method of Rapid
Amplification of cDNA Ends (RACE) (see, for example, Frohman et al. (1988) Proc. Natl.
Acad. Sci. USA 85:8998-9002). Recent modifications of the technique, exemplified by the
Marathon™ technology (Clontech Laboratories, Inc.) for example, have significantly
simplified obtaining full-length cDNA sequences.

Another aspect of the present invention relates to isolated acyltransferase
polypeptides. Such polypeptides include isolated polypeptides set forth in the Sequence
Listing, as well as polypeptides and fragments thereof, particularly those polypeptides which
exhibit acyltransferase activity and also those polypeptides which have at least 50%, 60% or
70% identity, preferably at least 80% identity, more preferably at least 90% identity, and most
preferably at least 95% identity to a polypeptide sequence selected from the group of
sequences set forth in the Sequence Listing, and also include portions of such polypeptides,
wherein such portion of the polypeptide preferably includes at least 30 amino acids and more
preferably includes at least 50 amino acids.

"Identity", as is well understood in the art, is a relationship between two or more
polypeptide sequences or two or more polynucleotide sequences, as determined by comparing
the sequences. In the art, "identity" also means the degree of sequence relatedness between
polypeptide or polynucleotide sequences, as determined by the match between strings of such
sequences. "Identity" can be readily calculated by known methods including, but not limited
to, those described in Computational Molecular Biology, Lesk, A.M., ed., Oxford University
Press, New York (1988); Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York (1993); Computer Analysis of Sequence Data, Part I, Griffin,
A.M. and Griffin, H.G., eds., Humana Press, New Jersey (1994); Sequence Analysis in
Molecular Biology, von Heijne, G., Academic Press (1987); Sequence Analysis Primer,
Gribskov, M. and Devereux, J., eds., Stockton Press, New York (1991); and Carillo, H., and
Lipman, D., SIAM J Applied Math, 48: 1073 (1988). Methods to determine identity are
designed to give the largest match between the sequences tested. Moreover, methods to
determine identity are codified in publicly available programs. Computer programs which
can be used to determine identity between two sequences include, but are not limited to, GCG
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984)); and the suite of five BLAST
programs, three designed for nucleotide sequence queries (BLASTN, BLASTX, and
TBLASTX) and two designed for protein sequence queries (BLASTP and TBLASTN)
(Coulson, Trends in Biotechnology, 12: 76-80 (1994); Birren, et al., Genome Analysis, 1:
543-559 (1997)). The BLASTX program is publicly available from NCBI and other sources
(BLAST Manual, Altschul, S., et al., NCBI NLM NIH, Bethesda, MD 20894; Altschul, S., et
al., J. Mol. Biol., 215:403-410 (1990)). The well known Smith-Waterman algorithm can also
be used to determine identity.

Parameters for polypeptide sequence comparison typically include the following:
Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)
Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad.
Sci. USA 89:10915-10919 (1992)
Gap Penalty: 12
Gap Length Penalty: 4

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters along
with no penalty for end gap are the default parameters for peptide comparisons.

Parameters for polynucleotide sequence comparison include the following:
Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)
Comparison matrix: matches = +10; mismatches = 0
Gap Penalty: 50
Gap Length Penalty: 3

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters are the
default parameters for nucleic acid comparisons.
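To make the role of the polynucleotide comparison parameters concrete, the following Python sketch scores a global alignment with the recited values (matches = +10, mismatches = 0, gap penalty 50, gap length penalty 3). It is an independent illustration and not the "gap" program itself. Two points are assumptions: a gap of length k is charged the gap penalty plus k times the gap length penalty, and end gaps are penalized like internal gaps. The three-matrix recurrence is the standard affine-gap extension of the Needleman and Wunsch approach.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, match=10, mismatch=0,
                           gap_open=50, gap_extend=3):
    """Best global alignment score of sequences a and b with affine gaps.

    Assumption: a gap of length k costs gap_open + k * gap_extend, and
    end gaps are charged like internal gaps.
    """
    n, m = len(a), len(b)
    # M: last column pairs a[i-1] with b[j-1]
    # X: last column pairs a[i-1] with a gap
    # Y: last column pairs a gap with b[j-1]
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + i * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + j * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Opening a new gap costs gap_open plus the first extension.
            X[i][j] = max(M[i - 1][j] - gap_open - gap_extend,
                          X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open - gap_extend,
                          Y[i][j - 1] - gap_extend,
                          X[i][j - 1] - gap_open - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(global_alignment_score("ACGTACGT", "ACGTTCGT"))
    print(global_alignment_score("ACGTACGTACGT", "ACGTACGT"))
```

Because opening a gap costs far more than a mismatch under these parameters, the alignment tolerates isolated mismatches and introduces a gap only when the sequences differ in length or when several matches are recovered by doing so.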

The invention also includes polypeptides of the formula:
X-(R1)n-(R2)-(R3)n-Y

wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is hydrogen or
a metal, R1 and R3 are any amino acid residue, n is an integer between 1 and 1000, and R2 is
an amino acid sequence of the invention, particularly an amino acid sequence selected from
the group set forth in the Sequence Listing and preferably SEQ ID NOs: 2, 4, 6, 8, 11, 13, 15,
17, 19, 21, 23, and 218-225. In the formula, R2 is oriented so that its amino terminal residue
is at the left, bound to R1, and its carboxy terminal residue is at the right, bound to R3. Any
stretch of amino acid residues denoted by either R group, where R is greater than 1, may be
either a heteropolymer or a homopolymer, preferably a heteropolymer.

Polypeptides of the present invention include isolated polypeptides encoded by a
polynucleotide comprising a sequence selected from the group of a sequence contained in
SEQ ID NOs: 1, 3, 5, 7, 9, 10, 12, 14, 16, 18, 20, 22, and 226-233.

The polypeptides of the present invention can be mature protein or can be part of a
fusion protein.

Fragments and variants of the polypeptides are also considered to be a part of the
invention. A fragment is a variant polypeptide which has an amino acid sequence that is
entirely the same as part but not all of the amino acid sequence of the previously described
polypeptides. The fragments can be "free-standing" or comprised within a larger polypeptide
of which the fragment forms a part or a region, most preferably as a single continuous region.
Preferred fragments are biologically active fragments which are those fragments that mediate
activities of the polypeptides of the invention, including those with similar activity or
improved activity or with a decreased activity. Also included are those fragments that are
antigenic or immunogenic in an animal, particularly a human.

Variants of the polypeptide also include polypeptides that vary from the sequences set
forth in the Sequence Listing by conservative amino acid substitutions, that is, substitution
of a residue by another with like characteristics. In general, such substitutions are among Ala,
Val, Leu and Ile; between Ser and Thr; between Asp and Glu; between Asn and Gln; between
Lys and Arg; or between Phe and Tyr. Particularly preferred are variants in which 5 to 10; 1
to 5; 1 to 3 or one amino acid(s) are substituted, deleted, or added, in any combination.
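The conservative substitution groups named above can be expressed as a simple lookup. The Python sketch below uses one-letter amino acid codes and hypothetical function names; it handles only substitutions between sequences of equal length and does not address additions or deletions.

```python
# Groups of residues with like characteristics, as listed in the text,
# written in one-letter code: Ala/Val/Leu/Ile, Ser/Thr, Asp/Glu,
# Asn/Gln, Lys/Arg, Phe/Tyr.
CONSERVATIVE_GROUPS = [set("AVLI"), set("ST"), set("DE"),
                       set("NQ"), set("KR"), set("FY")]


def is_conservative(a, b):
    """True if replacing residue a with residue b stays within one group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(reference, variant):
    """List (position, ref, var, conservative?) for each differing residue.

    Assumption: the two sequences have equal length, so every difference
    is a substitution; positions are 1-based.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [(pos, r, v, is_conservative(r, v))
            for pos, (r, v) in enumerate(zip(reference, variant), start=1)
            if r != v]


if __name__ == "__main__":
    for change in classify_substitutions("MKLSD", "MRLSE"):
        print(change)
```

Counting the entries returned by classify_substitutions gives the number of substituted residues, which can be compared against the preferred ranges (5 to 10, 1 to 5, 1 to 3, or one).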

Variants that are fragments of the polypeptides of the invention can be used to 
produce the corresponding full length polypeptide by peptide synthesis. Therefore, these 
variants can be used as intermediates for producing the full-length polypeptides of the 
invention. 

The polynucleotides and polypeptides of the invention can be used, for example, in
the transformation of various host cells, as further discussed herein.

The invention also provides polynucleotides that encode a polypeptide that is a mature
protein plus additional amino or carboxyl-terminal amino acids, or amino acids within the
mature polypeptide (for example, when the mature form of the protein has more than one
polypeptide chain). Such sequences can, for example, play a role in the processing of a
protein from a precursor to a mature form, allow protein transport, shorten or lengthen protein
half-life, or facilitate manipulation of the protein in assays or production. It is contemplated
that cellular enzymes can be used to remove any additional amino acids from the mature
protein.

A precursor protein, having the mature form of the polypeptide fused to one or more
prosequences, may be an inactive form of the polypeptide. The inactive precursors generally
are activated when the prosequences are removed. Some or all of the prosequences may be
removed prior to activation. Such precursor proteins are generally called proproteins.

The polynucleotide and polypeptide sequences can also be used to identify additional
sequences which are homologous to the sequences of the present invention. The most
preferable and convenient method is to store the sequence in a computer readable medium,
for example, floppy disk, CD ROM, hard disk drives, external disk drives and DVD, and then
to use the stored sequence to search a sequence database with well known searching tools.
Examples of public databases include the DNA Database of Japan
(DDBJ) (http://www.ddbj.nig.ac.jp/); GenBank
(http://www.ncbi.nlm.nih.gov/web/Genbank/Index.html); and the European Molecular
Biology Laboratory Nucleic Acid Sequence Database (EMBL)
(http://www.ebi.ac.uk/ebi_docs/embl_db.html). A number of different search algorithms are
available to the skilled artisan, one example of which is the suite of programs referred to as
BLAST programs. There are five implementations of BLAST, three designed for nucleotide
sequence queries (BLASTN, BLASTX, and TBLASTX) and two designed for protein
sequence queries (BLASTP and TBLASTN) (Coulson, Trends in Biotechnology, 12: 76-80
(1994); Birren, et al., Genome Analysis, 1: 543-559 (1997)). Additional programs are
available in the art for the analysis of identified sequences, such as sequence alignment
programs, programs for the identification of more distantly related sequences, and the like,
and are well known to the skilled artisan.
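The division of the five BLAST implementations by query type can be summarized in a small lookup table. The Python sketch below records, for each program, the type of query it accepts and the type of database it searches; the table reflects the general behavior of these programs, and the function name programs_for is hypothetical.

```python
# program: (query type, database type)
# BLASTX translates a nucleotide query and searches a protein database;
# TBLASTN searches a translated nucleotide database with a protein query;
# TBLASTX translates both the nucleotide query and the nucleotide database.
BLAST_PROGRAMS = {
    "BLASTN": ("nucleotide", "nucleotide"),
    "BLASTX": ("nucleotide", "protein"),
    "TBLASTX": ("nucleotide", "nucleotide"),
    "BLASTP": ("protein", "protein"),
    "TBLASTN": ("protein", "nucleotide"),
}


def programs_for(query_type, database_type):
    """Return the sorted names of programs matching the given types."""
    return sorted(name for name, (q, d) in BLAST_PROGRAMS.items()
                  if q == query_type and d == database_type)


if __name__ == "__main__":
    # A protein sequence searched against a nucleotide (e.g. EST) database.
    print(programs_for("protein", "nucleotide"))
```

For example, a known acyltransferase protein sequence used to search a nucleotide database for related coding sequences would call for TBLASTN under this table.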

Plant Constructs and Methods of Use

Of interest in the present invention, is the use of the nucleotide sequences, or 
polynucleotides, in recombinant DNA constructs to direct the transcription or transcription 
and translation (expression) of the acyltransferase sequences of the present invention in a host 
cell.

Of particular interest is the use of the nucleotide sequences, or polynucleotides, in 
recombinant DNA constructs to direct the transcription or transcription and translation 
(expression) of the acyltransferase sequences of the present invention in a host cell. The 
expression constructs generally comprise a promoter functional in a host cell operably linked 
to a nucleic acid sequence encoding an acyltransferase of the present invention and a
transcriptional termination region functional in a host cell. 

By "host cell" is meant a cell which contains a vector and supports the replication,
and/or transcription or transcription and translation (expression) of the expression construct.
Host cells for use in the present invention can be prokaryotic cells, such as E. coli, or
eukaryotic cells such as yeast, plant, insect, amphibian, or mammalian cells. Preferably, host
cells are monocotyledonous or dicotyledonous plant cells.

Of particular interest in the present invention is the use of the polynucleotides of the 
present invention for the preparation of constructs to direct the transcription or transcription 
and translation of the nucleotide sequences encoding an acyltransferase in a host plant cell. 
Plant expression constructs generally comprise a promoter functional in a plant host cell 
operably linked to a nucleic acid sequence of the present invention and a transcriptional termination
region functional in a host plant cell. 

Those skilled in the art will recognize that there are a number of promoters which are 
functional in plant cells, and have been described in the literature. Chloroplast and plastid 
specific promoters, chloroplast or plastid functional promoters, and chloroplast or plastid 
operable promoters are also envisioned. 

One set of promoters is the constitutive promoters, such as the CaMV35S or FMV35S
promoters, that yield high levels of expression in most plant organs. Enhanced or duplicated
versions of the CaMV35S and FMV35S promoters are useful in the practice of this invention
(Odell, et al. (1985) Nature 313:810-812; Rogers, U.S. Patent Number 5,378,619). In
addition, it may also be preferred to bring about expression of the protein of interest in 
specific tissues of the plant, such as leaf, stem, root, tuber, seed, fruit, etc., and the promoter 
chosen should have the desired tissue and developmental specificity. 

Of particular interest is the expression of the nucleic acid sequences of the present 
invention from transcription initiation regions which are preferentially expressed in a plant 
seed tissue. Examples of such seed preferential transcription initiation sequences include 
those sequences derived from sequences encoding plant storage protein genes or from genes 
involved in fatty acid biosynthesis in oilseeds. Examples of such promoters include the 5' 
regulatory regions from such genes as napin (Kridl et al., Seed Sci. Res. 1:209-219 (1991)),
phaseolin, zein, soybean trypsin inhibitor, ACP, stearoyl-ACP desaturase, soybean α' subunit
of β-conglycinin (soy 7s, (Chen et al., Proc. Natl. Acad. Sci., 83:8560-8564 (1986))) and
oleosin.

It may be advantageous to direct the localization of proteins conferring acyltransferase
activity to a particular subcellular compartment, for example, to the mitochondrion,
endoplasmic reticulum, vacuoles, chloroplast or other plastidic compartment. For example,
where the genes of interest of the present invention will be targeted to plastids, such as
chloroplasts, for expression, the constructs will also employ the use of sequences to direct
the gene to the plastid. Such sequences are referred to herein as chloroplast transit peptides
(CTP) or plastid transit peptides (PTP). In this manner, where the gene of interest is not
directly inserted into the plastid, the expression construct will additionally contain a gene
encoding a transit peptide to direct the gene of interest to the plastid. The chloroplast transit
peptides may be derived from the gene of interest, or may be derived from a heterologous
sequence having a CTP. Such transit peptides are known in the art. See, for example, Von
Heijne et al. (1991) Plant Mol. Biol. Rep. 9:104-126; Clark et al. (1989) J. Biol. Chem.
264:17544-17550; della-Cioppa et al. (1987) Plant Physiol. 84:965-968; Romer et al. (1993)
Biochem. Biophys. Res. Commun. 196:1414-1421; and Shah et al. (1986) Science 233:478-481.
Additional transit peptides for the translocation of the protein to the endoplasmic reticulum
(ER), or vacuole may also find use in the constructs of the present invention.

Depending upon the intended use, the constructs may contain the nucleic acid 
sequence which encodes the entire acyltransferase protein, or a portion thereof. For example,
where antisense inhibition of a given acyltransferase protein is desired, the entire sequence is
not required. Furthermore, where acyltransferase sequences used in constructs are intended
for use as probes, it may be advantageous to prepare constructs containing only a particular
portion of an acyltransferase encoding sequence, for example a sequence which is discovered
to encode a highly conserved acyltransferase region.

The skilled artisan will recognize that there are various methods for the inhibition of
expression of endogenous sequences in a host cell. Such methods include, but are not limited
to, antisense suppression (Smith, et al. (1988) Nature 334:724-726), co-suppression (Napoli,
et al. (1989) Plant Cell 2:279-289), ribozymes (PCT Publication WO 97/10328), and
combinations of sense and antisense, such as those described by Waterhouse, et al. (1998)
Proc. Natl. Acad. Sci. USA 95:13959-13964. Methods for the suppression of endogenous
sequences in a host cell typically employ the transcription or transcription and translation of
at least a portion of the sequence to be suppressed. Such sequences may be homologous to
coding as well as non-coding regions of the endogenous sequence.
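
By way of illustration only, the antisense orientation of a portion of a sequence may be derived as its reverse complement. The following is a minimal sketch; the function names, the example sequence and the coordinates are hypothetical and are not taken from the sequences disclosed herein.

```python
# Illustrative sketch only: names and example input are assumptions.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_fragment(coding_seq: str, start: int, end: int) -> str:
    """Return the antisense strand of coding_seq[start:end].

    Coordinates are 0-based and end-exclusive (an assumption of this sketch).
    """
    if not 0 <= start < end <= len(coding_seq):
        raise ValueError("fragment coordinates fall outside the sequence")
    return reverse_complement(coding_seq[start:end])


if __name__ == "__main__":
    # Hypothetical six-base sequence; the first codon is converted to antisense.
    print(antisense_fragment("ATGGCC", 0, 3))
```

Only a portion of the sequence is passed to the function, consistent with the statement above that the entire sequence is not required for suppression.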

Regulatory transcript termination regions may be provided in plant expression
constructs of this invention as well. Transcript termination regions may be provided by the
DNA sequence encoding the acyltransferase or a convenient transcription termination region
derived from a different gene source, for example, the transcript termination region which is
naturally associated with the transcript initiation region. The skilled artisan will recognize
that any convenient transcript termination region which is capable of terminating transcription
in a plant cell may be employed in the constructs of the present invention.

Alternatively, constructs may be prepared to direct the expression of the 
acyltransferase sequences directly from the host plant cell plastid. Such constructs and 
methods are known in the art and are generally described, for example, in Svab, et al (1990) 
Proc. Natl Acad. Sci. USA 87:8526-8530 and Svab and Maliga (1993) Proc. Natl Acad. Sci. 
USA 90:913-917 and in U.S. Patent Number 5,693,507. 

A plant cell, tissue, organ, or plant into which the recombinant DNA constructs 
containing the expression constructs have been introduced is considered transformed, 
transfected, or transgenic. A transgenic or transformed cell or plant also includes progeny of 
the cell or plant and progeny produced from a breeding program employing such a transgenic 
plant as a parent in a cross and exhibiting an altered genotype resulting from the presence of 
an introduced acyltransferase nucleic acid sequence. 

The term "introduced" in the context of inserting a nucleic acid sequence into a cell, 
means "transfection", or "transformation" or "transduction" and includes reference to the 
incorporation of a nucleic acid sequence into a eukaryotic or prokaryotic cell where the 
nucleic acid sequence may be incorporated into the genome of the cell (for example, 
chromosome, plasmid, plastid, or mitochondrial DNA), converted into an autonomous 
replicon, or transiently expressed (for example, transfected mRNA). 

Plant expression or transcription constructs having an acyltransferase as the DNA 
sequence of interest for increased or decreased expression thereof may be employed with a 
wide variety of plant life, particularly, plant life involved in the production of vegetable oils 
for edible and industrial uses. Plants of interest in the present invention include
monocotyledonous and dicotyledonous plants. Most especially preferred are temperate
oilseed crops. Plants of interest include, but are not limited to, rapeseed (Canola and High
Erucic Acid varieties), sunflower, safflower, cotton, soybean, peanut, coconut and oil palms,
and corn. Depending on the method for introducing the recombinant constructs into the host
cell, other DNA sequences may be required. Importantly, this invention is applicable to
dicotyledonous and monocotyledonous species alike and will be readily applicable to new and/or
improved transformation and regulation techniques.

As used herein, the term "plant" includes reference to whole plants, plant organs (for
example, leaves, stems, roots, etc.), seeds, and plant cells and progeny of same. Plant cell, as
used herein includes, without limitation, seeds, suspension cultures, embryos, meristematic
regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and
microspores. The class of plants which can be used in the methods of the present invention is
generally as broad as the class of higher plants amenable to transformation techniques,
including both monocotyledonous and dicotyledonous plants. Particularly preferred plants of
interest include, but are not limited to, rapeseed (Canola and High Erucic Acid varieties),
sunflower, safflower, cotton, soybean, peanut, coconut and oil palms, and corn. Most
especially preferred plants include Brassica, soybean, and corn.

As used herein, "transgenic plant" includes reference to a plant which comprises
within its genome a heterologous polynucleotide. Generally, the heterologous polynucleotide
is stably integrated within the genome such that the polynucleotide is passed on to successive
generations. The heterologous polynucleotide may be integrated into the genome alone or as
part of a recombinant expression cassette. "Transgenic" is used herein to include any cell,
cell line, callus, tissue, plant part or plant, the genotype of which has been altered by the
presence of heterologous nucleic acid including those transgenics initially so altered as well
as those created by sexual crosses or asexual propagation from the initial transgenic.

Thus a plant having within its cells a heterologous polynucleotide is referred to herein 
as a transgenic plant. The heterologous polynucleotide can be either stably integrated into the 
genome, or can be extra-chromosomal. Preferably, the polynucleotide of the present
invention is stably integrated into the genome such that the polynucleotide is passed on to
successive generations. The polynucleotide is integrated into the genome alone or as part of a
recombinant expression cassette. "Transgenic" is used herein to include any cell, cell line,
callus, tissue, plant part or plant, the genotype of which has been altered by the presence of
heterologous nucleic acids including those transgenics initially so altered as well as those
created by sexual crosses or asexual reproduction of the initial transgenics.

As used herein, "heterologous" in reference to a nucleic acid is a nucleic acid that
originates from a foreign species, or, if from the same species, is substantially modified from
its native form in composition and/or genomic locus by deliberate human intervention. For
example, a promoter operably linked to a heterologous structural gene is from a species
different from that from which the structural gene was derived, or, if from the same species,
one or both are substantially modified from their original form. A heterologous protein may
originate from a foreign species, or, if from the same species, is substantially modified from
its original form by deliberate human intervention.

As used herein, a "recombinant expression cassette" is a nucleic acid construct,
generated recombinantly or synthetically, with a series of specified nucleic acid elements
which permit transcription of a particular nucleic acid in a target cell. The recombinant
expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA,
plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant expression cassette
portion of an expression vector includes, among other sequences, a nucleic acid sequence to
be transcribed and a promoter.

It is contemplated that the gene sequences may be synthesized, either completely or in
part, especially where it is desirable to provide plant-preferred sequences. Thus, all or a
portion of the desired structural gene (that portion of the gene which encodes the
acyltransferase protein) may be synthesized using codons preferred by a selected host.
Host-preferred codons may be determined, for example, from the codons used most frequently in
the proteins expressed in a desired host species.
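
The determination of host-preferred codons described above may be sketched as follows. This is a minimal illustration, assuming the standard genetic code and a user-supplied set of in-frame coding sequences from the host; the function names `preferred_codons` and `recode` and the example input are hypothetical.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional T, C, A, G ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def preferred_codons(coding_sequences):
    """Map each amino acid to the codon seen most often in the given sequences.

    Sequences are assumed to be in frame; stop codons and codons containing
    ambiguous bases are ignored.
    """
    counts = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper()
        for pos in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[pos:pos + 3]
            amino = CODON_TABLE.get(codon)
            if amino is not None and amino != "*":
                counts[amino][codon] += 1
    return {amino: tally.most_common(1)[0][0] for amino, tally in counts.items()}


def recode(protein, preferences):
    """Back-translate a protein using one preferred codon per residue."""
    return "".join(preferences[residue] for residue in protein.upper())


if __name__ == "__main__":
    prefs = preferred_codons(["ATGGCCGCCGCT"])  # hypothetical host sequence
    print(prefs, recode("MA", prefs))
```

Selecting the single most frequent codon per residue is only one possible reading of "codons used most frequently"; weighted sampling from the usage table would be an equally plausible design.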

One skilled in the art will readily recognize that antibody preparations, nucleic acid
probes (DNA and RNA) and the like may be prepared and used to screen and recover
"homologous" or "related" acyltransferase from a variety of plant sources. Homologous
sequences are found when there is an identity of sequence, which may be determined upon
comparison of sequence information, nucleic acid or amino acid, or through hybridization
reactions between a known acyltransferase and a candidate source. Conservative changes,
such as Glu/Asp, Val/Ile, Ser/Thr, Arg/Lys and Gln/Asn may also be considered in
determining sequence homology. Amino acid sequences are considered homologous by as
little as 25% sequence identity between the two complete mature proteins. (See generally,
Doolittle, R.F., OF URFS and ORFS (University Science Books, CA, 1986).)
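
The comparison described above may be illustrated by a short calculation of percent identity over a pairwise alignment, with the listed conservative changes counted separately. This is a sketch under stated assumptions: the two sequences are already aligned with "-" as the gap character, gapped columns are excluded from the denominator, and the function name and example sequences are hypothetical.

```python
# Conservative pairs named in the text (Glu/Asp, Val/Ile, Ser/Thr, Arg/Lys,
# Gln/Asn). Counting them as "similar" is an illustrative choice only.
CONSERVATIVE_PAIRS = {frozenset(pair) for pair in ("ED", "VI", "ST", "RK", "QN")}


def identity_and_similarity(aligned_a: str, aligned_b: str):
    """Return (percent identity, percent identity-or-conservative).

    Only columns in which neither sequence has a gap ('-') are compared.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif frozenset((a, b)) in CONSERVATIVE_PAIRS:
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    # Hypothetical four-residue alignment.
    print(identity_and_similarity("MEVS", "MDIA"))
```

A pair of sequences would meet the 25% threshold mentioned above when the first returned value is at least 25.0.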

Thus, other acyltransferase sequences can be obtained from the specific exemplified
sequences provided herein. Furthermore, it will be apparent that one can obtain natural and
synthetic sequences, including modified amino acid sequences and starting materials for
synthetic-protein modeling from the exemplified sequences and from acyltransferases which
are obtained through the use of such exemplified sequences. Modified amino acid sequences
include sequences which have been mutated, truncated, increased and the like, whether such
sequences were partially or wholly synthesized. Sequences which are actually purified from
plant preparations or are identical or encode identical proteins thereto, regardless of the
method used to obtain the protein or sequence, are equally considered naturally derived.

For immunological screening, antibodies to the acyltransferase protein can be
prepared by injecting rabbits or mice with the purified protein or portion thereof, such
methods of preparing antibodies being well known to those in the art. Either monoclonal or
polyclonal antibodies can be produced, although typically polyclonal antibodies are more
useful for gene isolation. Western analysis may be conducted to determine that a related
protein is present in a crude extract of the desired plant species, as determined by
cross-reaction with the antibodies to the acyltransferase protein. When cross-reactivity is observed,
genes encoding the related proteins are isolated by screening expression libraries representing
the desired plant species. Expression libraries can be constructed in a variety of commercially
available vectors, including lambda gt11, as described in Sambrook, et al. (Molecular
Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory, Cold
Spring Harbor, New York).

The nucleic acid sequences associated with acyltransferase proteins will find many
uses. For example, recombinant constructs can be prepared which can be used as probes, or
which will provide for expression of the acyltransferase protein in host cells to produce a
ready source of the enzyme and/or to modify the composition of triglycerides found therein.
Other useful applications may be found when the host cell is a plant host cell, either in vitro
or in vivo.

The modification of fatty acid compositions may also affect the fluidity of plant
membranes. Different lipid concentrations have been observed in cold-hardened plants, for
example. By this invention, one may be capable of introducing traits which contribute to chill
tolerance. Constitutive or temperature inducible transcription initiation regulatory control
regions may have special applications for such uses.

As discussed above, nucleic acid sequence encoding an acyltransferase of this
invention may include genomic, cDNA or mRNA sequence. By "encoding" is meant that the
sequence corresponds to a particular amino acid sequence either in a sense or anti-sense
orientation. By "extrachromosomal" is meant that the sequence is outside of the plant
genome with which it is naturally associated. By "recombinant" is meant that the sequence
contains a genetically engineered modification through manipulation via mutagenesis,
restriction enzymes, and the like.

Once the desired acyltransferase nucleic acid sequence is obtained, it may be
manipulated in a variety of ways. Where the sequence involves non-coding flanking regions,
the flanking regions may be subjected to resection, mutagenesis, etc. Thus, transitions,
transversions, deletions, and insertions may be performed on the naturally occurring
sequence. In addition, all or part of the sequence may be synthesized. In the structural gene,
one or more codons may be modified to provide for a modified amino acid sequence, or one
or more codon mutations may be introduced to provide for a convenient restriction site or
other purpose involved with construction or expression. The structural gene may be further
modified by employing synthetic adapters, linkers to introduce one or more convenient
restriction sites, or the like.

The nucleic acid or amino acid sequences encoding an acyltransferase of this
invention may be combined with other non-native, or "heterologous", sequences in a variety
of ways. By "heterologous" sequences is meant any sequence which is not naturally found
joined to the acyltransferase, including, for example, combinations of nucleic acid sequences
from the same plant which are not naturally found joined together.

The DNA sequence encoding an acyltransferase of this invention may be employed in
conjunction with all or part of the gene sequences normally associated with the
acyltransferase. In its component parts, a DNA sequence encoding acyltransferase is
combined in a DNA construct having, in the 5' to 3' direction of transcription, a transcription
initiation control region capable of promoting transcription and translation in a host cell, the
DNA sequence encoding plant acyltransferase and a transcription and translation termination
region.

Potential host cells include both prokaryotic cells, such as E. coli, and eukaryotic cells
such as yeast, insect, amphibian, or mammalian cells. A host cell may be unicellular or found
in a multicellular differentiated or undifferentiated organism depending upon the intended
use. Preferably, host cells of the present invention include plant cells, both
monocotyledonous and dicotyledonous. Cells of this invention may be distinguished by
having a sequence foreign to the wild-type cell present therein, for example, by having a
recombinant nucleic acid construct encoding an acyltransferase therein.

The methods used for the transformation of the host plant cell are not critical to the
present invention. The transformation of the plant is preferably permanent, i.e. by integration
of the introduced expression constructs into the host plant genome, so that the introduced
constructs are passed on to successive plant generations. The skilled artisan will recognize
that a wide variety of transformation techniques exist in the art, and new techniques are
continually becoming available. Any technique that is suitable for the target host plant can be
employed within the scope of the present invention. For example, the constructs can be
introduced in a variety of forms including, but not limited to, as a strand of DNA, in a
plasmid, or in an artificial chromosome. The introduction of the constructs into the target
plant cells can be accomplished by a variety of techniques, including, but not limited to,
calcium-phosphate-DNA co-precipitation, electroporation, microinjection, Agrobacterium
infection, liposomes or microprojectile transformation. The skilled artisan can refer to the
literature for details and select suitable techniques for use in the methods of the present
invention.

Normally, included with the DNA construct will be a structural gene having the
necessary regulatory regions for expression in a host and providing for selection of
transformant cells. The gene may provide for resistance to a cytotoxic agent, e.g. antibiotic,
heavy metal, toxin, etc., complementation providing prototrophy to an auxotrophic host, viral
immunity or the like. Depending upon the number of different host species into which the
expression construct or components thereof are introduced, one or more markers may be employed,
where different conditions for selection are used for the different hosts.

Where Agrobacterium is used for plant cell transformation, a vector may be used
which may be introduced into the Agrobacterium host for homologous recombination with
T-DNA or the Ti- or Ri-plasmid present in the Agrobacterium host. The Ti- or Ri-plasmid
containing the T-DNA for recombination may be armed (capable of causing gall formation)
or disarmed (incapable of causing gall formation), the latter being permissible, so long as the
vir genes are present in the transformed Agrobacterium host. The armed plasmid can give a
mixture of normal plant cells and gall.

In some instances where Agrobacterium is used as the vehicle for transforming host
plant cells, the expression or transcription construct bordered by the T-DNA border region(s)
will be inserted into a broad host range vector capable of replication in E. coli and
Agrobacterium, there being broad host range vectors described in the literature. Commonly
used is pRK2 or derivatives thereof. See, for example, Ditta, et al., (Proc. Nat. Acad. Sci.,
U.S.A. (1980) 77:7347-7351) and EPA 0 120 515, which are incorporated herein by reference.
Alternatively, one may insert the sequences to be expressed in plant cells into a vector
containing separate replication sequences, one of which stabilizes the vector in E. coli, and
the other in Agrobacterium. See, for example, McBride and Summerfelt (Plant Mol. Biol.
(1990) 14:269-276), wherein the pRiHRI (Jouanin, et al., Mol. Gen. Genet. (1985) 201:370-374)
origin of replication is utilized and provides for added stability of the plant expression
vectors in host Agrobacterium cells.

Included with the expression construct and the T-DNA will be one or more markers,
which allow for selection of transformed Agrobacterium and transformed plant cells. A
number of markers have been developed for use with plant cells, such as resistance to
chloramphenicol, kanamycin, the aminoglycoside G418, hygromycin, or the like. The
particular marker employed is not essential to this invention, one or another marker being
preferred depending on the particular host and the manner of construction.

For transformation of plant cells using Agrobacterium, explants may be combined and
incubated with the transformed Agrobacterium for sufficient time for transformation, the
bacteria killed, and the plant cells cultured in an appropriate selective medium. Once callus
forms, shoot formation can be encouraged by employing the appropriate plant hormones in
accordance with known methods and the shoots transferred to rooting medium for
regeneration of plants. The plants may then be grown to seed and the seed used to establish
repetitive generations and for isolation of vegetable oils.

There are several possible ways to obtain the plant cells of this invention which
contain multiple expression constructs. Any means for producing a plant comprising a
construct having a nucleic acid sequence of the present invention, and at least one other
construct having another DNA sequence encoding an enzyme, is encompassed by the present
invention. For example, the expression construct can be used to transform a plant at the same
time as the second construct either by inclusion of both expression constructs in a single
transformation vector or by using separate vectors, each of which expresses desired genes. The
second construct can be introduced into a plant which has already been transformed with the
first expression construct, or alternatively, transformed plants, one having the first construct
and one having the second construct, can be crossed to bring the constructs together in the
same plant.

In general, acyltransferase proteins are active in the transfer of acyl groups from a
donor to a variety of different substrates. For example, diacylglycerol acyltransferases add
acyl groups to diacylglycerol to form triacylglycerol (TAG), or acyl-CoA:cholesterol
acyltransferase uses an acyl-CoA as a donor to transfer an acyl group to a sterol to form a
sterol ester. Typically, the substrates include, but are not limited to, glycerides, including
mono and diglycerides, sterols, stanols, phosphatides, and the like. Donors include, but are
not limited to, acyl-CoA and acyl-ACP molecules.

The invention now being generally described, it will be more readily understood by
reference to the following examples which are included for purposes of illustration only and
are not intended to limit the present invention.

EXAMPLES 



Example 1: RNA Isolations 



Total RNA from the inflorescence and developing seeds of Arabidopsis thaliana is
isolated for use in construction of complementary (cDNA) libraries. The procedure is an
adaptation of the DNA isolation protocol of Webb and Knapp (D.M. Webb and S.J. Knapp,
(1990) Plant Molec. Reporter, 8, 180-185). The following description assumes the use of 1 g
fresh weight of tissue. Frozen seed tissue is powdered by grinding under liquid nitrogen. The
powder is added to 10 ml REC buffer (50 mM Tris-HCl, pH 9, 0.8 M NaCl, 10 mM EDTA,
0.5% w/v CTAB (cetyltrimethyl-ammonium bromide)) along with 0.2 g insoluble
polyvinylpolypyrrolidone, and ground at room temperature. The homogenate is centrifuged
for 5 minutes at 12,000 x g to pellet insoluble material. The resulting supernatant fraction is
extracted with chloroform, and the top phase is recovered.

The RNA is then precipitated by addition of 1 volume RecP (50 mM Tris-HCl, pH 9,
10 mM EDTA and 0.5% (w/v) CTAB) and collected by brief centrifugation as before. The
RNA pellet is redissolved in 0.4 ml of 1 M NaCl. The RNA pellet is redissolved in water and
extracted with phenol/chloroform. Sufficient 3 M potassium acetate (pH 5) is added to make
the mixture 0.3 M in acetate, followed by addition of two volumes of ethanol to precipitate the
RNA. After washing with ethanol, this final RNA precipitate is dissolved in water and stored
frozen.
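
The volume of 3 M potassium acetate needed to bring a sample to 0.3 M follows from a simple dilution balance. The sketch below is illustrative only; it assumes the sample contains no acetate before the addition, and it assumes that "two volumes of ethanol" refers to twice the volume of the mixture after the acetate is added. The function names and the example volume are hypothetical.

```python
def stock_volume_to_add(sample_volume_ml: float,
                        stock_molar: float = 3.0,
                        target_molar: float = 0.3) -> float:
    """Volume of stock to add so that the mixture reaches target_molar.

    Solves stock * v = target * (sample + v) for v, assuming the sample
    initially contains none of the solute.
    """
    if not 0 < target_molar < stock_molar:
        raise ValueError("target must be positive and below the stock molarity")
    return sample_volume_ml * target_molar / (stock_molar - target_molar)


def ethanol_volume(sample_volume_ml: float, volumes: float = 2.0) -> float:
    """Ethanol to add, taken here as `volumes` times the acetate-adjusted mixture."""
    mixture = sample_volume_ml + stock_volume_to_add(sample_volume_ml)
    return volumes * mixture


if __name__ == "__main__":
    sample = 0.9  # ml, hypothetical aqueous phase volume
    print(round(stock_volume_to_add(sample), 3), round(ethanol_volume(sample), 3))
```

With the default values the stock volume works out to one ninth of the sample volume.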

Alternatively, total RNA may be obtained using TRIzol reagent (BRL-Life Technologies,
Gaithersburg, MD) following the manufacturer's protocol. The RNA
precipitate is dissolved in water and stored frozen.

Example 2: Identification of Acyltransferase Homology Sequences

Searches are performed on a Silicon Graphics Unix computer using additional
Bioaccellerator hardware and GenWeb software supplied by Compugen Ltd. This software
and hardware enables the use of the Smith-Waterman algorithm in searching DNA and
protein databases using profiles as queries. The program used to query protein databases is
profilesearch. This is a search where the query is not a single sequence but a profile based on
a multiple alignment of amino acid or nucleic acid sequences. The profile is used to query a
sequence data set, i.e., a sequence database. The profile contains all the pertinent information
for scoring each position in a sequence, in effect replacing the "scoring matrix" used for the
standard query searches. The program used to query nucleotide databases with a protein
profile is tprofilesearch. Tprofilesearch searches nucleic acid databases using an amino acid
profile query. As the search is running, sequences in the database are translated to amino acid
sequences in six reading frames. The output file for tprofilesearch is identical to the output
file for profilesearch except for an additional column that indicates the frame in which the
best alignment occurred.
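
The six-frame translation step described above may be sketched as follows. This is not the tprofilesearch implementation, whose internals are not described here; it is a minimal illustration assuming the standard genetic code, with frames labeled +1 to +3 for the given strand and -1 to -3 for the reverse complement. All names are hypothetical.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna: str) -> str:
    """Translate codon by codon; codons with ambiguous bases become 'X'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def six_frame_translations(dna: str) -> dict:
    """Return translations keyed by frame: 1..3 forward, -1..-3 reverse."""
    forward = dna.upper()
    reverse = forward.translate(_COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(forward[offset:])
        frames[-(offset + 1)] = translate(reverse[offset:])
    return frames


if __name__ == "__main__":
    for frame, peptide in sorted(six_frame_translations("ATGGCC").items()):
        print(frame, peptide)
```

The frame key corresponds to the kind of information carried by the additional output column mentioned above, namely the frame in which an alignment was found.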

The Smith-Waterman algorithm (Smith and Waterman (1981) supra) is used to
search for similarities between one sequence from the query and a group of sequences
contained in the database. E score values as well as other sequence information, such as
conserved peptide sequences of HXXXXD and PEG, are used to identify related sequences.
By using the conserved peptide sequence information, E score values of greater than E-12 and
E-8 are considered. For example, the EST sequence originally used to identify ATAT2 had
an E score of 0.0094, while the EST sequence originally used to identify ATLPAAT1 had an
E score of 0.0868.
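
Screening candidate sequences for the conserved peptide sequences HXXXXD and PEG may be sketched with regular expressions, where X is taken to mean any single residue. The function names and the example peptide are hypothetical, and requiring both motifs in `has_both_motifs` is an assumption of this sketch rather than a criterion stated above.

```python
import re

# Lookahead patterns so that overlapping occurrences are also reported.
MOTIFS = {
    "HXXXXD": re.compile(r"(?=H.{4}D)"),
    "PEG": re.compile(r"(?=PEG)"),
}


def find_motifs(protein: str) -> dict:
    """Return the 1-based start positions of each motif in the sequence."""
    protein = protein.upper()
    return {
        name: [match.start() + 1 for match in pattern.finditer(protein)]
        for name, pattern in MOTIFS.items()
    }


def has_both_motifs(protein: str) -> bool:
    """True when at least one HXXXXD and one PEG occurrence are present."""
    hits = find_motifs(protein)
    return all(hits[name] for name in MOTIFS)


if __name__ == "__main__":
    example = "MKHAAAADLLPEGTT"  # hypothetical peptide
    print(find_motifs(example), has_both_motifs(example))
```

Such a check could be applied to the translated hits of a profile search so that weaker E scores are considered only for sequences carrying the conserved residues.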

A protein sequence of glycerol-3-phosphate acyltransferase (G3PAAT) from E. coli (Swiss Prot
Accession P00482) is used to search the NCBI non-redundant protein database using BLAST. In the
first round of searches, other membrane forms of G3PAAT are identified. In subsequent
PSI-BLAST searches (Altschul, et al. (1997) Nucleic Acids Res. 25:3389-3402), LPAATs and
other acyltransferases are identified. Using sequence alignment software programs, G3PAAT
and different LPAAT amino acid sequences are aligned, and a profile is generated using a
homologous sequence region, between amino acids 256 and 459 of the E. coli sequence.

The identified 204 amino acid region is used to query the protein database using PSI-BLAST.
After 5 iterations of PSI-BLAST, the profile generated from this new query (Figure 1)
identified soluble forms of G3PAAT. Prior to this identification, no sequence homology had
been identified between the membrane and soluble forms of G3PAAT.



Example 3: Excision of PSI-BLAST Profile

The profile generated from the queries using PSI-BLAST is excised from the hypertext
markup language (html) file. The worldwide web (www)/html interface to PSI-BLAST at
NCBI stores the current generated profile matrix in a hidden field in the html file that is
returned after each iteration of PSI-BLAST. However, this matrix has been encoded into string62
(s62) format for ease of transport through html. String62 format is a simple conversion of the
values of the matrix into html legal ASCII characters.

The encoded matrix width (x axis) is 26 characters, and comprises the consensus
character, the probabilities of each amino acid in the order A,B,C,D,E,F,G,H,I,K,L,M,N,
P,Q,R,S,T,V,W,X,Y,Z (where B represents D and N, and Z represents Q and E, and X
represents any amino acid), the gap creation value, and the gap extension value.

The length (y axis) of the matrix corresponds to the length of the sequences identified
by PSI-BLAST. The order of the amino acids corresponds to the conserved amino acid
sequence of the sequences identified using PSI-BLAST, with the N-terminal end at the top of
the matrix. The probabilities of other amino acids at that position are represented for each
amino acid along the x axis, below the respective single letter amino acid abbreviation.

Thus, each row of the profile consists of the highest scoring (consensus) amino acid,
followed by the scores for each possible amino acid at that position in sequence matrix, the
score for opening a gap at that position, and the score for continuing a gap at that position.

The string62 file is converted back into a profile for use in subsequent searches. The
gap open field is set to 11 and the gap extension field is set to 1 along the x axis. The gap
creation and gap extension values are known, based on the settings given to the PSI-BLAST
algorithm. The matrix is exported to the standard GCG profile form. This format can be read
by GenWeb.

The algorithm used to convert the string62 formatted file to the matrix is outlined in
Table 1.

Table 1

1. if encoded character z then the value is blast score min
2. if encoded character Z then the value is blast score max
3. else if the encoded character is uppercase then its value is (64-(ascii # of char))
4. else if the encoded character is a digit the value is ((ascii # of char)-48)
5. else if the encoded character is not uppercase then the value is ((ascii # of char) - 87)
6. ALL B positions are set to min of D and N amino acids at that row in sequence matrix
7. ALL Z positions are set to min of Q and E amino acids at that row in sequence matrix
8. ALL X positions are set to min of all amino acids at that row in sequence matrix
9. kBLAST_SCORE_MAX=999;
10. kBLAST_SCORE_MIN=-999;
11. all gap opens are set to 11
12. all gap lens are set to 1
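
The rules of Table 1 may be sketched in code as follows. This is an illustrative reading of the table and not the program actually used; in particular, rule 5 is restricted here to lowercase letters and any other character raises an error, and the layout of the encoded string within the html file is not modeled. The names `decode_char`, `decode_string` and `complete_row` are hypothetical.

```python
BLAST_SCORE_MAX = 999    # rule 9
BLAST_SCORE_MIN = -999   # rule 10
GAP_OPEN = 11            # rule 11
GAP_LEN = 1              # rule 12
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def decode_char(ch: str) -> int:
    """Decode one string62 character according to rules 1 to 5."""
    if ch == "z":
        return BLAST_SCORE_MIN
    if ch == "Z":
        return BLAST_SCORE_MAX
    if "A" <= ch <= "Z":
        return 64 - ord(ch)      # 'A' -> -1, 'B' -> -2, ...
    if "0" <= ch <= "9":
        return ord(ch) - 48      # '0' -> 0, ..., '9' -> 9
    if "a" <= ch <= "z":
        return ord(ch) - 87      # 'a' -> 10, 'b' -> 11, ...
    raise ValueError(f"character {ch!r} is not valid in this sketch")


def decode_string(encoded: str) -> list:
    """Decode every character of an encoded string."""
    return [decode_char(ch) for ch in encoded]


def complete_row(scores: dict) -> dict:
    """Apply rules 6 to 8, 11 and 12 to one row of decoded scores.

    `scores` maps each of the 20 standard amino acids to its decoded value.
    """
    row = {aa: scores[aa] for aa in STANDARD_AMINO_ACIDS}
    row["B"] = min(row["D"], row["N"])
    row["Z"] = min(row["Q"], row["E"])
    row["X"] = min(row[aa] for aa in STANDARD_AMINO_ACIDS)
    row["gap_open"] = GAP_OPEN
    row["gap_len"] = GAP_LEN
    return row


if __name__ == "__main__":
    print(decode_string("zZA7a"))
```

Under these rules digits and lowercase letters carry non-negative values while uppercase letters carry negative values, with z and Z reserved for the minimum and maximum scores.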





Example 4: Identification of Novel Acyltransferase Related Amino Acid Sequences 

The profile (Figure 1) is used in further queries to identify a number of previously
unidentified proteins from yeast as novel acyltransferases. A protein is identified from an
Arabidopsis protein sequence database (ATAT1) (SEQ ID NO:2). Sequences are also
identified from nucleic acid databases (Table 2).

Table 2

Database ID Number   BLAST Search Hits                            Log probability

Saccharomyces cerevisiae
gi 1078509           Limnanthes putative LPAAT                    e-10 (SEQ ID NO:217)
gi 586485            Limnanthes putative LPAAT                    e-13 (SEQ ID NO:218)
gi 320748            Limnanthes putative LPAAT                    e-19 (SEQ ID NO:219)
gi 2506920           SUPPRESSES CTR1 (choline transport mutant)   (SEQ ID NO:220)
gi 549627            similar to CTR1                              e-118 (SEQ ID NO:221)
gi 2133031           unidentified                                 (SEQ ID NO:222)
gi 2132939           unidentified                                 (SEQ ID NO:223)
gi 2132299           TAFAZZIN                                     e-14 (SEQ ID NO:224)

In Table 2, the gi number is the database identifier, the middle column shows the
results of BLAST searches against the NCBI NR protein database, and the log probability
number represents the log of the probability of such a match occurring by random
chance. These proteins, including the ATAT1 protein sequence, are identified using the
original PSI-BLAST search of the NCBI NR protein database. Thus, these proteins are novel
acyltransferase related proteins with unidentified activities.

The Arabidopsis acyltransferase sequence, herein referred to as ATAT1, is also
identified using the original PSI-BLAST search of the NCBI NR protein database, and did not
have an annotated function.

Additional Arabidopsis amino acid sequences related to acyltransferases are identified
from the databases, referred to as ATAT2est, ATAT3est, ATAT4est, ATAT5est, ATAT6est,
ATAT7est, ATAT8est, ATAT9, ATAT10, and ATAT11est. Furthermore, Arabidopsis
amino acid sequences are identified which demonstrate sequence similarity to known
lysophosphatidic acid acyltransferases, referred to as ATLPAAT1. The sequences of ATAT9 and
ATAT10 are identified from the database as genomic sequences; all other Arabidopsis sequences are
identified as ESTs.

30 
To obtain the entire coding region corresponding to the Arabidopsis acyltransferase
sequences, synthetic oligonucleotide primers are designed to amplify the 5' and 3' ends of
partial cDNA clones containing acyltransferase related sequences. Primers are designed
according to the respective Arabidopsis acyltransferase related sequences (Table 3) and used
in Rapid Amplification of cDNA Ends (RACE) reactions (Frohman et al. (1988) Proc. Natl.
Acad. Sci. USA 85:8998-9002) using the Marathon cDNA amplification kit (Clontech
Laboratories Inc., Palo Alto, CA). Primers with an R designation are used for 5' RACE
reactions, and primers with an F designation are used for 3' RACE reactions.

ATAT2

ATAT2R1      CCATCCGCTTCAAGGGAACGACACCCATCA    (SEQ ID NO:135)
ATAT2R2      TCCCTGTCTTGCTTGATGAACTTAAAGCTTG   (SEQ ID NO:136)
ATAT2R3      ACAGCAGGAGTGTCTGATGATGGCAGATTC    (SEQ ID NO:137)

ATAT3

ATAT3R1      ACTGGAGTTCCAGCCAAAAATGCACCTGTC    (SEQ ID NO:138)
ATAT3R2      GATACACCCTTGAAATCAGGCGATTTTGCT    (SEQ ID NO:139)

ATAT4

ATAT4R1      TTGCAAATTCAATTCCTGTTTCACCGGGCC    (SEQ ID NO:140)
ATAT4R2      GTTTTCTGCTATTCCAGAAGGCGTCAACAA    (SEQ ID NO:141)

ATAT5

ATAT5R1      CATTGAAGATCCGTCCGTGAAGTTNCCTTACC  (SEQ ID NO:142)
ATAT5R2      TCGAGCTGTGATCGATGATTGGCTGTGAAG    (SEQ ID NO:143)
ATAT5F1      GTCTCTTCAAAAACACACACACACGTCTCT    (SEQ ID NO:144)
ATAT5F2      GTCTCTTCAAAAACACACACACACGTCTCT    (SEQ ID NO:145)

ATAT6

H76348-F1    GTAGAGAGCCTTACTTGCTTCGGTTTAGTC    (SEQ ID NO:146)
H76348-F2    ACGTCATCGTACCTGTTGCTATTGACTCAC    (SEQ ID NO:147)
H76348-R1    ACTTTTCCATTGTCAGGGACTCCTCGACAC    (SEQ ID NO:148)
H76348-R2    ACGGTGTAGGAAGGGAAAGGATTCAAAAGG    (SEQ ID NO:149)

ATAT7

ATTS0193-F1  GCGATGAACTACAGAGTCGGATTCTTCCTC    (SEQ ID NO:150)
ATTS0193-F2  CCGGTTTACGAGATTACGTTCTTGAACCAG    (SEQ ID NO:151)
ATTS0193-R1  CAATGGAGACAAGGCTCGAAAGTGCTAACC    (SEQ ID NO:152)
ATTS0193-R2  ATTCTCTGAACATAGTTCGCCACGGTCATG    (SEQ ID NO:153)

ATAT8

AA042618-F1  GAAATCCAACGCCTTCCCAATATCACTCTG    (SEQ ID NO:154)
AA042618-F2  CTTCAACTTTCCATCAGGATCTTGGCACGT    (SEQ ID NO:155)
AA042618-R1  ACCACTTGTTAGAGACCTTACCTGCTTAGG    (SEQ ID NO:156)
AA042618-R2  TCCTACCTACACCATCCAATTTCTCGACCC    (SEQ ID NO:157)

ATAT11

ATAT11R1     CTGCGTCAAGTGAGCAACTCAGTTCTTGCA    (SEQ ID NO:158)
ATAT11R2     TGGGAAGCAGCACGTTGTTCAGTATCGGAA    (SEQ ID NO:159)
ATAT11R3     TAGCCTCTGTGTAATCTGTGCCCTCGGGGA    (SEQ ID NO:160)


From the nucleic acid sequences obtained from the RACE reactions, protein sequence
is predicted for each nucleic acid sequence using MacVector software. Nucleic acid sequences
are provided for ATAT1 (SEQ ID NO:1), ATAT2 (SEQ ID NO:3), ATAT3 (SEQ ID NO:5),
ATAT4 (SEQ ID NO:7), ATAT5 (SEQ ID NO:9), ATAT6 (SEQ ID NO:10), ATAT7 (SEQ
ID NO:12), ATAT8 (SEQ ID NO:14), ATAT9 (SEQ ID NO:16), ATAT10 (SEQ ID NO:18),
ATAT11 (SEQ ID NO:20) and ATLPAAT1 (SEQ ID NO:22), respectively.

The protein sequence (SEQ ID NO:2) derived from the ATAT1 nucleic acid sequence
from Arabidopsis has a predicted molecular mass of 32.5 kDa and a pI of 9.74. Alignment
of the Arabidopsis acyltransferase with several LPAAT and G3PAAT sequences shows that
some of the domains that are conserved between LPAAT and G3PAAT are conserved in the new
acyltransferase protein.

The ATAT2 nucleic acid sequence is predicted to encode a 312 amino acid protein
(SEQ ID NO:4), with a molecular weight of 34.6 kD and a pI of 9.99. The ATAT2 protein
may also contain 2 to 3 transmembrane domains. However, the protein encoded by the
ATAT2 nucleic acid sequence may be longer than predicted because of the absence of an
in-frame stop codon upstream of the ATG start codon used.

The ATAT3 nucleic acid sequence is predicted to encode a 398 amino acid protein
(SEQ ID NO:6), with a molecular weight of 44.7 kD and a pI of 5.62. The ATAT3 protein
may contain 1 to 4 transmembrane domains. The ATAT4 nucleic acid sequence is predicted
to encode a 317 amino acid protein (SEQ ID NO:8), with a molecular weight of 36.5 kD and
a pI of 9.67. The ATAT4 protein is predicted to have 2 to 5 transmembrane domains.

The ATLPAAT1 nucleic acid sequence is predicted to encode a 389 amino acid
protein (SEQ ID NO:23), with a molecular weight of 43.7 kD and a pI of 9.52. The
ATLPAAT1 protein is predicted to have up to 3 transmembrane domains. The protein
predicted from the ATLPAAT1 nucleic acid sequence is similar to LPAATs reported for
Brassica, maize, and meadowfoam (described in PCT Publication WO 94/13814). The
ATAT11 nucleic acid sequence is predicted to encode a 375 amino acid protein (SEQ ID
NO:21), with a molecular weight of 43.5 kD and a pI of 9.45. The deduced amino acid
sequences of ATAT6 (SEQ ID NO:11), ATAT7 (SEQ ID NO:13), ATAT8 (SEQ ID NO:15),
ATAT9 (SEQ ID NO:17), and ATAT10 (SEQ ID NO:19) are also provided.
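
A predicted molecular weight of the kind quoted above can be approximated from a deduced amino acid sequence by summing average residue masses and adding one water molecule for the free termini. The sketch below illustrates that arithmetic only; it is not the MacVector calculation, the function name is hypothetical, and values from different programs may differ slightly.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once per chain for the N- and C-terminal groups


def molecular_weight_kd(protein):
    """Approximate molecular weight in kD of a one-letter protein sequence."""
    daltons = sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER
    return daltons / 1000.0


# Example: the first 16 residues of SEQ ID NO:2 as printed in the sequence listing.
print(round(molecular_weight_kd("MVHATKSATTIPKERL"), 2))
```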

A sequence region approximately 30 amino acids upstream through approximately
100 amino acids downstream of the conserved amino acid sequences HXXXXD (Heath and
Rock (1998) J. Bacteriol. 180(6):1425-1430) and PEG (Neuwald (1997) Curr. Biol. 7:R465-R466)
of the predicted amino acid sequences derived from the nucleic acid sequences of
ATAT1, ATAT2, ATAT3, ATAT4, ATAT6, ATAT7, ATAT8, ATAT9, ATAT10,
ATLPAAT1, and ATAT11 are compared to the amino acid sequences of lysophosphatidic
acid acyltransferase (Jojoba AT (SEQ ID NO:162; the nucleic acid sequence is provided in
SEQ ID NO:161), maize AT (PCT Publication WO 94/13814), PLSC coco (GenBank
accession 1098605), PLSC Lim (GenBank accession 1209507), PLSC Ecoli (GenBank
accession 1209507), and PLSC Yeast (GenBank accession 464422)) and glycerol-3-phosphate
acyltransferase (PLSB Ecoli (GenBank accession 130326) and PLSB Mouse (GenBank
accession 2498786)) (Figure 2), and similarities are identified (Figure 2 and Figure 3).

Sequence comparisons reveal that several classes of acyltransferases exist based on
conserved amino acid sequences identified in the comparisons in Figure 2. For example,
ATAT1, ATAT6, ATAT7, ATAT8, and ATAT9 contain the conserved amino acid
sequences VTYSXS (SEQ ID NO:128), VXLTRXR (SEQ ID NO:129), and LXXGDLV (SEQ
ID NO:132) between the HXXXXD and PEG sequences. In addition, ATAT1, ATAT6,
ATAT7, ATAT8, and ATAT9 also contain the conserved sequence CPEGT (SEQ ID NO:130),
which comprises the PEG sequence, as well as IVPVA (SEQ ID NO:131) and
VANXXQ (SEQ ID NO:134) (Figure 2) downstream of the PEG sequence. The sequences
corresponding to ATAT1, ATAT7, and ATAT9 are the most closely related in this class, with
similarities between ATAT1 and ATAT9 of 67.0%, between ATAT1 and ATAT7 of 58.2%,
and between ATAT9 and ATAT7 of 63.9% (Figure 3B).

Sequence comparisons also demonstrate that the sequence of ATLPAAT1 is most
closely related to the jojoba LPAAT (82.3% similar) and the maize LPAAT (78.0% similar).

Furthermore, sequence analysis demonstrates that ATAT4 is the most divergent
sequence, with its highest similarity to ATAT10 (18.5%). Its highest similarity (15.3%) to a
known sequence is with a meadowfoam (Limnanthes douglasii) LPAAT. However, the
sequences of ATAT4 and ATAT10 share several conserved peptide sequences with the amino
acid sequences of ATAT2 and ATAT3 (Figure 2): VXNHXS (SEQ ID NO:127), where the H
comprises the conserved H of the HXXXXD sequence, and FXXGAF (SEQ ID NO:133)
downstream of the PEG sequence.
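
The conserved peptide sequences discussed above can be located in a deduced protein sequence with simple pattern matching, as in the following minimal sketch. The helper name is hypothetical; X is treated as any single residue; and the test sequence is only the first 112 residues of SEQ ID NO:2 as printed in the sequence listing, so motifs lying further downstream are not expected to be found in it.

```python
import re

# Motifs named in the text; X (any residue) becomes "." in a regular expression.
MOTIFS = {
    "HXXXXD": "H....D",
    "PEG": "PEG",
    "VXNHXS": "V.NH.S",
    "VTYSXS": "VTYS.S",
    "FXXGAF": "F..GAF",
}

# First 112 residues of SEQ ID NO:2 (partial sequence, one-letter code).
ATAT1_FRAGMENT = (
    "MVHATKSATTIPKERL" "KNRIVFHDGRLAQRPT" "PLNAIITYLWLPFGFI"
    "LSIIRVYFNLPLPERF" "VRYTYEMLGIHLTIRG" "HRPPPPSPGTLGNLYV"
    "LNHRTALDPIIVAIAL"
)


def find_motifs(protein, motifs=MOTIFS):
    """Return 1-based start positions of each motif in the protein."""
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead is used so that overlapping matches are also reported.
        hits[name] = [m.start() + 1 for m in re.finditer(f"(?={pattern})", protein)]
    return hits


print(find_motifs(ATAT1_FRAGMENT)["HXXXXD"])
```

In this fragment the HXXXXD sequence appears once, as HRTALD.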

Example 6: Identification of Additional Acyltransferase Sequences

The novel Arabidopsis sequences identified above are used to search proprietary
databases containing soybean and corn EST sequences. The results of this search identify
EST sequences from soybean (SEQ ID NO:24 through SEQ ID NO:85) as well as from corn
(SEQ ID NO:86 through SEQ ID NO:126) as encoding acyltransferase related proteins.

Sequence comparisons between the various EST sequences and the complete
Arabidopsis sequences reveal that the identified EST sequences demonstrate higher
similarity to the various Arabidopsis sequences as determined by BLAST scores.

Expressed Sequence Tag (EST) sequences from soybean and corn databases are
identified which are most closely related by BLAST score to ATAT1 (SEQ ID NOS:24-29
and SEQ ID NOS:86-88, respectively), ATAT2 (SEQ ID NO:30 and SEQ ID NO:89,
respectively), ATAT3 (SEQ ID NOS:31-35 and SEQ ID NOS:90-94, respectively), ATAT4
(SEQ ID NOS:36-44 and SEQ ID NOS:95-100, respectively), ATAT6 (SEQ ID NOS:45-49
and SEQ ID NO:101, respectively), ATAT7 (SEQ ID NOS:50-54 and SEQ ID NOS:102-103,
respectively), ATAT8 (SEQ ID NOS:55-56 and SEQ ID NO:104, respectively), ATAT9
(SEQ ID NOS:57-79 and SEQ ID NOS:105-111, respectively), ATAT10 (SEQ ID NOS:80-81
and SEQ ID NO:112, respectively), ATAT11 (SEQ ID NOS:82-85 and SEQ ID
NOS:123-126, respectively), and ATLPAAT1 (SEQ ID NOS:113-122).

Example 7: Expression Construct Preparation

A series of synthetic oligonucleotide primers were prepared for use in Polymerase
Chain Reactions (PCR) to amplify the entire DNA sequences encoding the various
acyltransferase sequences identified above. The sequences are listed in Table 3.



Table 3

Primer        Sequence (listed 5'-3')                                         SEQ ID NO:

ATAT1F        AAGCTTGCATGCGTCGACACAATGGTTCATGCGACCAAGTCAG                     163
ATAT1R        GGTACCGTCGACTCACTTCTTGGTGTTGTTGATAG                             164
ATAT2F        GGATCCGCGGCCGCACAATGACGAGCTTTACTACTTCCCTTCAT                    165
ATAT2R        GGATCCCCTGCAGGTTAGAGATCCATTGATTCTGCAAT                          166
ATAT3F        GGATCCGCGGCCGCATAATGGAATCAGAGCTCAAAGAT                          167
ATAT3R        GGATCCCCTGCAGGTCATTCTTCTTTCTGATGGAAATC                          168
ATAT4F        GGATCCGCGGCCGCACAATGACTCGTTCACAAGATGTTTCA                       169
ATAT4R        GGATCCCCTGCAGGTCACTTCTCTTCCAATCTAGCCAG                          170
ATAT6F        GGATCCGCGGCCGCACAATGTCCGGTAATAAGATCTCGACTCTTCA                  171
ATAT6R        GGATCCCCTGCAGGTTATTTTTTCTTGACAACTCCGTTATTACCGG                  172
ATAT7F        ATATCCGCGGCCGCACAATGGTTATGGAGCAAGCTGGAA                         173
ATAT7R        GGATCCCCTGCAGGTCAATGGAGACAAGGCTCGAAAGT                          174
ATAT8F        GGATCCGCGGCCGCACAATGTCCGCCAAGATTTCAATATTCC                      175
ATAT8R        GGATCCCCTGCAGGTTAATTTTTCTTAACTACTCCATT                          176
ATAT9F        GGATCCGCGGCCGCACAATGGGAGCTCAGGAGAAACGGCGCC                      177
ATAT9R        GGATCCCCTGCAGGTCACGTCTTCTCCTTCTTCACCGG                          178
ATAT10F       GGATCCGCGGCCGCACAATGGCGGATCCTGATCTGTCTTCTCCT                    179
ATAT10R       GGATCCCCTGCAGGTTATGTTGGGGCCAAGTCAGGTGCAAAGAT                    180
ATAT11F       GGATCCGCGGCCGCAAAATGGAAAAAAAGAGTGTACCAAATTCT                    181
ATAT11R       GGATCCCCTGCAGGTTATTTGTTTACTAATTTGAGGGAATTTTTTG                  182
ATLPAAT1F     TCGACCTGCAGGAAGCTTAAGGATGGTGATTGCTGC                            183
ATLPAAT1R     GGATCCGCGGCCGCTTACTTCTCCTTCTCCG                                 184
YSCAT1F       GGATCCGCGGCCGCACAATGTCTTTTAGGGATGTCCTAG                         185
YSCAT1R       GGATCCCCTGCAGGTCAATCATCCTTACCCTTTGGTTTACC                       186
YSCAT1 KO F   ATGTCTTTTAGGGATGTCCTAGAAAGAGGAGATGAATTTTCTGTGCGGTATTTCACACCG    187
YSCAT1 KO R   TCAATCATCCTTACCCTTTGGTTTACCCTCTGGAGGCAGAAGATTGTACTGAGAGTGCAC    188
YSCAT2F       GGATCCGCGGCCGCACAATGAAGCATTCCCAAAAATACCGTAGG                    189
YSCAT2R       GGATCCCCTGCAGGTCAATGATTTTTTTTCATCACAAATAC                       190
YSCAT2 KO F   ATGAAGCATTCCCAAAAATACCGTAGGTATGGAATTTATGCTGTGCGGTATTTCACACCG    191
YSCAT2 KO R   TCAATGATTTTTTTTCATCACAAATACAAGAATAAGAAAAAGATTGTACTGAGAGTGCAC    192
YSCAT3F       GGATCCGCGGCCGCACAATGGGTTTTGTTGATTTCTTCGAAAC                     193
YSCAT3R       GGATCCCCTGCAGGTTATTTGGTCTCAATTTTAATATTTTTTTGC                   194
YSCAT3 KO F   ATGGGTTTTGTTGATTTCTTCGAAACATATATGGTCGGTTCTGTGCGGTATTTCACACCG    195
YSCAT3 KO R   TTATTTGGTCTCAATTTTAATATTTTTTTGCAAGGACTCGAGATTGTACTGAGAGTGCAC    196
YSCAT4F       GGATCCGCGGCCGCACAATGGAAAAGTACACCAATTGGAGAGAC                    197
YSCAT4R       GGATCCCCTGCAGGCTACTTCCTCTTTTTACGTTGATCGCTG                      198
YSCAT4 KO F   ATGGAAAAGTACACCAATTGGAGAGACAATGGTACGGGAACTGTGCGGTATTTCACACCG    199
YSCAT4 KO R   CTACTTCCTCTTTTTACGTTGATCGCTGATATATTCCTTCAGATTGTACTGAGAGTGCAC    200
YSCAT5F       GGATCCGCGGCCGCACAATGCCTGCACCAAAACTCACGGAG                       201
YSCAT5R       GGATCCCCTGCAGGCTACGCATCTCCTTCTTTCCCTTC                          202
YSCAT5 KO F   ATGCCTGCACCAAAACTCACGGAGAAATCTGCCTCTTCCACTGTGCGGTATTTCACACCG    203
YSCAT5 KO R   CTACGCATCTCCTTCTTTCCCTTCTTCTTCTTCTTCCTCTAGATTGTACTGAGAGTGCAC    204
YSCAT6F       GGATCCGCGGCCGCACAATGTCTGCTCCCGCTGCCGATCATAACGC                  205
YSCAT6R       GGATCCCCTGCAGGTCATTCTTTCTTTTCGTGTTCTCTTTTCTG                    206
YSCAT6 KO F   [first 40 nt illegible in source]CTGTGCGGTATTTCACACCG           207
YSCAT6 KO R   TCATTCTTTCTTTTCGTGTTCTCTTTTCTGTCTTACCAGCAGATTGTACTGAGAGTGCAC    208
YSCAT7F       GGATCCGCGGCCGCACAATGCTGCATCAAAAAATAGCTCATAAAGTTCG               209
YSCAT7R       GGATCCCCTGCAGGTCAAAAAATAAAACAATAAAGTTTATAAACTAACC               210
YSCAT7 KO F   ATGCTGCATCAAAAAATAGCTCATAAAGTTCGAAAAGTCGCTGTGCGGTATTTCACACCG    211
YSCAT7 KO R   TCAAAAAATAAAACAATAAAGTTTATAAACTAACCAAATTAGATTGTACTGAGAGTGCAC    212
YSCAT8F       GGATCCGCGGCCGCACAATGAGTGTGATAGGTAGGTTCTTG                       213
YSCAT8R       GGATCCCCTGCAGGTTAATGCATCTTTTTTACAGATGAACC                       214
YSCAT8 KO F   ATGAGTGTGATAGGTAGGTTCTTGTATTACTTGAGGTCCGCTGTGCGGTATTTCACACCG    215
YSCAT8 KO R   TTAATGCATCTTTTTTACAGATGAACCTTCGTTATGGGTAAGATTGTACTGAGAGTGCAC    216


The entire coding regions for each of the acyltransferase sequences were amplified
using the respective primers listed in Table 3 above, cloned into the vector pCR2.1Topo
(Invitrogen) or pZero (Invitrogen), and labeled as pCGN8558 (ATAT1), pCGN8564
(ATAT2), pCGN8565 (ATAT3), pCGN8566 (ATAT4), pCGN8918 (ATAT6),
pCGN8913 (ATAT7), pCGN8904 (ATAT8), pCGN9970 (ATAT9), pCGN9940
(ATAT10), pCGN8567 (ATAT11), pCGN8632 (ATLPAAT1), pCGN9901 (YSCAT1,
also referred to as gi2132299), pCGN9902 (YSCAT2, also referred to as gi1078509),
pCGN9903 (YSCAT3, also referred to as gi2132939), pCGN9904 (YSCAT4, also
referred to as gi2133031), pCGN9905 (YSCAT5, also referred to as gi320748), pCGN9906
(YSCAT6, also referred to as gi549627), pCGN9907 (YSCAT7, also referred to as
gi586485), and pCGN9908 (YSCAT8, also referred to as gi464422). The nucleic acid
sequences for the respective yeast acyltransferases are provided as YSCAT1 (SEQ ID
NO:225), YSCAT2 (SEQ ID NO:226), YSCAT3 (SEQ ID NO:227), YSCAT4 (SEQ ID
NO:228), YSCAT5 (SEQ ID NO:229), YSCAT6 (SEQ ID NO:230), YSCAT7 (SEQ ID
NO:231), and YSCAT8 (SEQ ID NO:232).
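
Most of the amplification primers in Table 3 begin with restriction enzyme recognition sequences placed ahead of the coding sequence. The following minimal sketch shows how such sites can be located in a primer. The two primer sequences are copied from Table 3; the function name is hypothetical, and the recognition sequences are the standard ones for the enzymes named.

```python
# Recognition sequences of the enzymes of interest.
SITES = {
    "BamHI": "GGATCC",
    "NotI": "GCGGCCGC",
    "Sse8387I": "CCTGCAGG",
    "PstI": "CTGCAG",
}

# Primers copied from Table 3 (SEQ ID NO:165 and SEQ ID NO:166).
PRIMERS = {
    "ATAT2F": "GGATCCGCGGCCGCACAATGACGAGCTTTACTACTTCCCTTCAT",
    "ATAT2R": "GGATCCCCTGCAGGTTAGAGATCCATTGATTCTGCAAT",
}


def site_positions(primer, sites=SITES):
    """Map each enzyme whose site occurs in the primer to its first 0-based offset."""
    found = {}
    for enzyme, site in sites.items():
        index = primer.find(site)
        if index != -1:
            found[enzyme] = index
    return found


for name, sequence in PRIMERS.items():
    print(name, site_positions(sequence))
```

For this pair, the forward primer carries a NotI site and the reverse primer an Sse8387I site (which contains a PstI site), each preceded by a BamHI site.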

7A. Baculovirus Expression Constructs

Constructs are prepared to direct the expression of the Arabidopsis ATAT sequences
in cultured insect cells. The entire coding regions of ATAT1, 2, 3, 4, 6, 7, 8, 9, 10, and 11 are
cloned into the vector pFastBac1 (Gibco-BRL, Gaithersburg, MD) digested with NotI and
PstI. The respective coding sequences were cloned as NotI/Sse8387I fragments. Double
stranded DNA sequence was obtained to verify that no errors were introduced by PCR
amplification. The resulting plasmids were designated pCGN9723 (ATAT1), pCGN9724
(ATAT2), pCGN9725 (ATAT3), pCGN9726 (ATAT4), pCGN9727 (ATAT5), pCGN9728
(ATAT7), pCGN9729 (ATAT8), pCGN9991 (ATAT9), pCGN9730 (ATAT10), and pCGN9731
(ATAT11).

7B. Plant Expression Construct Preparation 

A plasmid containing the napin cassette derived from pCGN3223 (described in USPN
5,639,790, the entirety of which is incorporated herein by reference) was modified to make it
more useful for cloning large DNA fragments containing multiple restriction sites, and to
allow the cloning of multiple napin fusion genes into plant binary transformation vectors. An
adapter comprised of the self-annealed oligonucleotide of sequence
CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT
(SEQ ID NO:233) was ligated into the cloning vector pBC SK+ (Stratagene) after
digestion with the restriction endonuclease BssHII to construct vector pCGN7765. Plasmids
pCGN3223 and pCGN7765 were digested with NotI and ligated together. The resultant
vector, pCGN7770, contains the pCGN7765 backbone with the napin seed specific
expression cassette from pCGN3223.
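
The self-annealing property of the adapter can be checked directly: apart from its four-base 5' extension, the oligonucleotide should equal its own reverse complement. The sketch below assumes that the sequence of SEQ ID NO:233 ends in ATTTAAAT (the final two bases are read from the text wrapped around the SEQ ID label, which is an assumption); the helper name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO:233 as printed; the last two bases ("AT") are assumed from the wrapped text.
ADAPTER = "CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAA" + "AT"

# The first four bases match the 5' overhang left by BssHII digestion (G^CGCGC).
overhang, core = ADAPTER[:4], ADAPTER[4:]

# A self-annealing oligonucleotide pairs with a second copy of itself.
self_annealing = core == reverse_complement(core)

# Recognition sequences expected in the duplex region.
sites = {"SwaI": "ATTTAAAT", "AscI": "GGCGCGCC",
         "Sse8387I": "CCTGCAGG", "NotI": "GCGGCCGC"}
present = {name: site in core for name, site in sites.items()}

print(overhang, self_annealing, present)
```

Under that assumption the duplex region is a perfect palindrome centred on the NotI site, which is consistent with the subsequent NotI digestion and ligation step.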

The cloning cassette pCGN7787 contains essentially the same regulatory elements as
pCGN7770, with the exception that the napin regulatory regions of pCGN7770 have been
replaced with the double CaMV 35S promoter and the tml polyadenylation and
transcriptional termination region.

A binary vector for plant transformation, pCGN5139, was constructed from
pCGN1558 (McBride and Summerfelt (1990) Plant Molecular Biology 14:269-276). The
polylinker of pCGN1558 was replaced as a HindIII/Asp718 fragment with a polylinker
containing unique restriction endonuclease sites, AscI, PacI, XbaI, SwaI, BamHI, and NotI.
The Asp718 and HindIII restriction endonuclease sites are retained in pCGN5139.

A series of turbo binary vectors are constructed to allow for the rapid cloning of DNA
sequences into binary vectors containing transcriptional initiation regions (promoters) and
transcriptional termination regions.

The plasmid pCGN8618 was constructed by ligating oligonucleotides
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:234) and
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:235) into SalI/XhoI-digested
pCGN7770. A fragment containing the napin promoter, polylinker and napin 3'
region was excised from pCGN8618 by digestion with Asp718I; the fragment was blunt-ended
by filling in the 5' overhangs with Klenow fragment, then ligated into pCGN5139 that
had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs
with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter
was closest to the blunted Asp718I site of pCGN5139 and the napin 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated pCGN8622.
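
The two oligonucleotides used to build pCGN8618 are designed to anneal to one another, leaving short single-stranded ends for ligation into the digested vector. The following sketch checks this for SEQ ID NO:234 and SEQ ID NO:235. The function names are hypothetical, and the four-base overhang length is an assumption based on the ends produced by SalI and XhoI.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def annealed_overhangs(top, bottom, overhang_len=4):
    """Return the two 5' overhangs if the oligos pair perfectly outside them, else None."""
    if top[overhang_len:] != reverse_complement(bottom[overhang_len:]):
        return None
    return top[:overhang_len], bottom[:overhang_len]


OLIGO_234 = "TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG"  # SEQ ID NO:234
OLIGO_235 = "TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC"  # SEQ ID NO:235

print(annealed_overhangs(OLIGO_234, OLIGO_235))
```

Both strands leave a 5'-TCGA extension, the overhang produced by SalI as well as by XhoI, so the annealed linker can ligate into the digested vector in either orientation; this is consistent with sense and antisense vectors being built from the same pair of oligonucleotides.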

The plasmid pCGN8619 was constructed by ligating oligonucleotides
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:236) and
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:237) into SalI/XhoI-digested
pCGN7770. A fragment containing the napin promoter, polylinker and napin 3'
region was removed from pCGN8619 by digestion with Asp718I; the fragment was blunt-ended
by filling in the 5' overhangs with Klenow fragment, then ligated into pCGN5139 that
had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs
with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter
was closest to the blunted Asp718I site of pCGN5139 and the napin 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated pCGN8623.

The plasmid pCGN8620 was constructed by ligating oligonucleotides
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT-3' (SEQ ID NO:238) and
5'-CCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:239) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region
was removed from pCGN8620 by complete digestion with Asp718I and partial digestion with
NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment,
then ligated into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended
by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert
oriented so that the d35S promoter was closest to the blunted Asp718I site of pCGN5139 and
the tml 3' was closest to the blunted HindIII site was subjected to sequence analysis to
confirm both the insert orientation and the integrity of cloning junctions. The resulting
plasmid was designated pCGN8624.

The plasmid pCGN8621 was constructed by ligating oligonucleotides
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT-3' (SEQ ID NO:240) and
5'-GGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:241) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region
was removed from pCGN8621 by complete digestion with Asp718I and partial digestion with
NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment,
then ligated into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended
by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert
oriented so that the d35S promoter was closest to the blunted Asp718I site of pCGN5139 and
the tml 3' was closest to the blunted HindIII site was subjected to sequence analysis to
confirm both the insert orientation and the integrity of cloning junctions. The resulting
plasmid was designated pCGN8625.

The coding regions of the various acyltransferase sequences were cloned as
NotI/Sse8387I fragments into pCGN8622, pCGN8623, pCGN8624, and pCGN8625, for
expression in sense or antisense orientations from a tissue preferential promoter, napin, or the
35S promoter. Fragments which were cloned into the pCGN8622 vector created the
constructs pCGN8901 (ATAT1), pCGN8571 (ATAT2), pCGN8909 (ATAT3), pCGN8596
(ATAT4), pCGN8919 (ATAT6), pCGN8914 (ATAT7), pCGN8905 (ATAT8), pCGN9973
(ATAT9), pCGN9942 (ATAT10), pCGN8575 (ATAT11), and pCGN8633 (ATLPAAT1) for
the sense expression of the respective coding sequences from the napin promoter. Fragments
which were cloned into the pCGN8623 vector created the constructs pCGN8900 (ATAT1),
pCGN8572 (ATAT2), pCGN8910 (ATAT3), pCGN8597 (ATAT4), pCGN8920 (ATAT6),
pCGN8915 (ATAT7), pCGN8906 (ATAT8), pCGN9972 (ATAT9), pCGN9943 (ATAT10),
pCGN8576 (ATAT11), and pCGN8634 (ATLPAAT1) for the antisense expression of the
respective coding sequences from the napin promoter. Fragments which were cloned into the
pCGN8624 vector created the constructs pCGN8903 (ATAT1), pCGN8573 (ATAT2),
pCGN8911 (ATAT3), pCGN8598 (ATAT4), pCGN8921 (ATAT6), pCGN8916 (ATAT7),
pCGN8907 (ATAT8), pCGN9971 (ATAT9), pCGN9944 (ATAT10), pCGN8577 (ATAT11),
and pCGN8635 (ATLPAAT1) for the sense expression of the respective coding sequences
from the 35S promoter. Fragments which were cloned into the pCGN8625 vector created the
constructs pCGN8902 (ATAT1) and pCGN9974 (ATAT9) for the antisense expression of
the respective coding sequences from the 35S promoter.

In addition, the yeast acyltransferase coding sequences were cloned into the vector
pCGN8624, creating the constructs pCGN9926 (YSCAT1), pCGN9927 (YSCAT2),
pCGN9928 (YSCAT3), pCGN9929 (YSCAT4), pCGN9930 (YSCAT5), pCGN9931
(YSCAT6), pCGN9932 (YSCAT7), and pCGN9933 (YSCAT8). These constructs allow for
the sense expression of the respective acyltransferase coding sequences from the 35S
promoter in plant cells.

Example 8: Plant Transformation 

A variety of methods have been developed to insert a DNA sequence of interest into the
genome of a plant host to obtain the transcription or transcription and translation of the sequence
to effect phenotypic changes.

Transgenic Brassica plants are obtained by Agrobacterium-mediated transformation
as described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694; Plant Cell Reports
(1992) 11:499-505). Transgenic Arabidopsis thaliana plants may be obtained by
Agrobacterium-mediated transformation as described by Valvekens et al. (Proc. Nat. Acad.
Sci. (1988) 85:5536-5540), or as described by Bent et al. ((1994) Science 265:1856-1860),
Bechtold et al. ((1993) C.R. Acad. Sci., Life Sciences 316:1194-1199), or Clough et al. ((1998)
Plant J. 16:735-43). Other plant species may be similarly transformed using related
techniques.

Alternatively, microprojectile bombardment methods, such as described by Klein et
al. (Bio/Technology 10:286-291), may also be used to obtain nuclear transformed plants.

The above results demonstrate that the nucleic acid sequences identified encode
proteins which are related to acyltransferase proteins. Such acyltransferase sequences
find use in preparing expression constructs for plant transformations.

WO 00/18889                                                        PCT/US99/22231

All publications and patent applications mentioned in this specification are indicative
of the level of skill of those skilled in the art to which this invention pertains. All
publications and patent applications are herein incorporated by reference to the same extent as
if each individual publication or patent application was specifically and individually indicated
to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be obvious that
certain changes and modifications may be practiced within the scope of the appended claims.

Claims

What is Claimed is: 

1. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:127
(VxNHxS) wherein the H is the conserved Histidine residue in the conserved peptide
sequence HXXXXD of said acyltransferase-like protein, x representing any amino acid.

2. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:128
(VTYSxS) within about 30 amino acids downstream from the conserved amino acid sequence
HXXXXD of said acyltransferase-like protein, x representing any amino acid.

3. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:129
(VxLTRxR) within about 60 amino acids downstream from the conserved amino acid
sequence HXXXXD of said acyltransferase-like protein, x representing any amino acid.

4. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:132
(LxxGDLV) within about 20 amino acids upstream of the conserved amino acid sequence
PEG of said acyltransferase-like protein, x representing any amino acid.

5. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:130 (CPEGT)
containing the conserved amino acid sequence PEG of said acyltransferase-like protein.

6. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:133
(FxxGAF) within about 20 amino acids downstream from the conserved amino acid sequence
PEG of said acyltransferase-like protein, x representing any amino acid.

7. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:131 (IVPVA)
within about 40 amino acids downstream from the conserved amino acid sequence PEG of
said acyltransferase-like protein.

8. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:134
(VANxxQ) within about 110 amino acids downstream from the conserved amino acid
sequence PEG of said acyltransferase-like protein, x representing any amino acid.

9. A DNA sequence encoding an enzyme of the class of acyltransferase-like proteins,
said DNA sequence obtainable by the steps comprising:

(a) using the profile of Figure 1 to search a nucleic acid sequence database;

(b) obtaining a probability score for nucleic acid sequences in said sequence
database using the Smith-Waterman algorithm; and

(c) selecting a nucleic acid sequence having a probability score of less than about 1.

10. The DNA encoding sequence according to Claim 9, wherein said DNA sequence
is an encoding sequence.

11. The DNA encoding sequence according to Claim 9, wherein said DNA sequence
is an EST.

12. The DNA encoding sequence according to any one of Claims 1 to 11, wherein
said acyltransferase-like protein is from a plant.

13. A construct comprising a DNA sequence of any one of Claims 1 to 11 linked to a
heterologous transcriptional and translational initiation region functional in a host cell.

14. The construct according to Claim 13 wherein said host cell is a plant cell.

15. A plant cell comprising a DNA construct according to Claim 13.

16. A plant comprising a cell according to Claim 15.

17. The DNA encoding sequence of any one of Claims 1 to 11 wherein said
acyltransferase-like protein is from Arabidopsis thaliana.

18. The DNA encoding sequence of any one of Claims 1 to 11 wherein said
acyltransferase-like protein is from corn.

19. The DNA encoding sequence of Claim 18 wherein said sequence comprises an
EST selected from the group consisting of SEQ ID NO:86 through SEQ ID NO:126.

20. The DNA encoding sequence of any one of Claims 1 to 11 wherein said
acyltransferase-like protein is from soybean.

21. The DNA encoding sequence of Claim 20 wherein said sequence comprises an
EST selected from the group consisting of SEQ ID NO:24 through SEQ ID NO:85.

22. The DNA encoding sequence of any one of Claims 2, 3, 4, 5, 7 and 8 wherein
said acyltransferase-like protein is selected from the group consisting of SEQ ID NO:1, SEQ
ID NO:10, SEQ ID NO:12, SEQ ID NO:14 and SEQ ID NO:16.

23. The DNA encoding sequence of either of Claim 1 and Claim 6 wherein said
acyltransferase-like protein is selected from the group consisting of SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7 and SEQ ID NO:18.

SEQUENCE LISTING 

<110> Lassner, Michael W
      Emig, Robin A
      Ruezinsky, Diane
      Van Eenennaam, Alison

<120> Novel Plant Acyltransferases

<130> 17029/00/WO

<140>
<141>

<150> 60/101,939
<151> 1998-09-25

<160> 241

<170> PatentIn Ver. 2.0

<210> 1
<211> 869
<212> DNA
<213> Arabidopsis sp.

<400> 1
atggttcatg cgaccaagtc agccacaacg attccaaaag aacgcttaaa gaaccgcata   60
gtcttccatg atgggcgttt agcgcaacgt ccaactccgt taaacgccat tatcacatac  120
ctatggcttc cttttggttt catctctcca tcattcgcgt ctacttcaac ctccctttac  180
ctgaaagatt tgtccgttac acttacgaga tgctcgggat ccacttaacc attcgtggtc  240
atcgtcctcc acctccttcc cctggaactc ttggcaacct ctatgtcctt aaccaccgta  300
ccgcgcttga tcccatcatc gtcgctattg ctcttggacg taagatctgt tgcgtcactt  360
acagtgtctc tcgtctctcc cttatgcttt ctcctattcc tgctgttgcc ctcacccgtg  420
accgtgccac cgatgctgcc aacatgagaa aacttctcga gaaaggcgac ttggtgatat  480
gtccggaagg cacgacgtgt agagaagagt atctactgag atttagcgct ctattcgcag  540
agctaagcga ccggattgtg ccagtagcga tgaactgtaa acaaggaatg ttcaacggga  600
ccacagttag gggtgtgaag ttttgggacc cttacttctt cttcatgaac ccaagaccaa  660
gctatgaagc cactttcttg gatcgtttgc ctgaagaaat gactgtcaac ggtggtggca  720
agactcctat agaggtggct aattacgtcc agaaagttat cggcgcggtt ttgggcttcg  780
aatgcaccga acttactcgc aaggataaat atcttttgct tggaggtaat gacggcaagg  840
tggagtctat caacaacacc aagaagtga                                    869

<210> 2
<211> 289
<212> PRT
<213> Arabidopsis sp.

<400> 2
Met Val His Ala Thr Lys Ser Ala Thr Thr Ile Pro Lys Glu Arg Leu
  1               5                  10                  15

Lys Asn Arg Ile Val Phe His Asp Gly Arg Leu Ala Gln Arg Pro Thr
             20                  25                  30

Pro Leu Asn Ala Ile Ile Thr Tyr Leu Trp Leu Pro Phe Gly Phe Ile
         35                  40                  45

Leu Ser Ile Ile Arg Val Tyr Phe Asn Leu Pro Leu Pro Glu Arg Phe
     50                  55                  60

Val Arg Tyr Thr Tyr Glu Met Leu Gly Ile His Leu Thr Ile Arg Gly
 65                  70                  75                  80

His Arg Pro Pro Pro Pro Ser Pro Gly Thr Leu Gly Asn Leu Tyr Val
                 85                  90                  95

Leu Asn His Arg Thr Ala Leu Asp Pro Ile Ile Val Ala Ile Ala Leu
            100                 105                 110
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2 
SEQUENCE LISTING

<110> Lassner, Michael W
Emig, Robin A
Ruezinsky, Diane
Van Eenennaam, Alison

<120> Novel Plant Acyltransferases

<130> 17029/00/WO

<140>
<141>

<150> 60/101,939
<151> 1998-09-25

<160> 241

<170> PatentIn Ver. 2.0

<210> 1
<211> 869
<212> DNA
<213> Arabidopsis sp.

<400> 1
atggttcatg cgaccaagtc agccacaacg attccaaaag aacgcttaaa gaaccgcata 60
gtcttccatg atgggcgttt agcgcaacgt ccaactccgt taaacgccat tatcacatac 120
ctatggcttc cttttggttt catctctcca tcattcgcgt ctacttcaac ctccctttac 180
ctgaaagatt tgtccgttac acttacgaga tgctcgggat ccacttaacc attcgtggtc 240
atcgtcctcc acctccttcc cctggaactc ttggcaacct ctatgtcctt aaccaccgta 300
ccgcgcttga tcccatcatc gtcgctattg ctcttggacg taagatctgt tgcgtcactt 360
acagtgtctc tcgtctctcc cttatgcttt ctcctattcc tgctgttgcc ctcacccgtg 420
accgtgccac cgatgctgcc aacatgagaa aacttctcga gaaaggcgac ttggtgatat 480
gtccggaagg cacgacgtgt agagaagagt atctactgag atttagcgct ctattcgcag 540
agctaagcga ccggattgtg ccagtagcga tgaactgtaa acaaggaatg ttcaacggga 600
ccacagttag gggtgtgaag ttttgggacc cttacttctt cttcatgaac ccaagaccaa 660
gctatgaagc cactttcttg gatcgtttgc ctgaagaaat gactgtcaac ggtggtggca 720
agactcctat agaggtggct aattacgtcc agaaagttat cggcgcggtt ttgggcttcg 780
aatgcaccga acttactcgc aaggataaat atcttttgct tggaggtaat gacggcaagg 840
tggagtctat caacaacacc aagaagtga 869



<210> 2
<211> 289
<212> PRT
<213> Arabidopsis sp.

<400> 2

Met Val His Ala Thr Lys Ser Ala Thr Thr Ile Pro Lys Glu Arg Leu
1 5 10 15

Lys Asn Arg Ile Val Phe His Asp Gly Arg Leu Ala Gln Arg Pro Thr
20 25 30

Pro Leu Asn Ala Ile Ile Thr Tyr Leu Trp Leu Pro Phe Gly Phe Ile
35 40 45

Leu Ser Ile Ile Arg Val Tyr Phe Asn Leu Pro Leu Pro Glu Arg Phe
50 55 60

Val Arg Tyr Thr Tyr Glu Met Leu Gly Ile His Leu Thr Ile Arg Gly
65 70 75 80

His Arg Pro Pro Pro Pro Ser Pro Gly Thr Leu Gly Asn Leu Tyr Val
85 90 95

Leu Asn His Arg Thr Ala Leu Asp Pro Ile Ile Val Ala Ile Ala Leu
100 105 110



WO 00/18889 



PCT/US99/22231 



Gly Arg Lys Ile Cys Cys Val Thr Tyr Ser Val Ser Arg Leu Ser Leu
115 120 125

Met Leu Ser Pro Ile Pro Ala Val Ala Leu Thr Arg Asp Arg Ala Thr
130 135 140

Asp Ala Ala Asn Met Arg Lys Leu Leu Glu Lys Gly Asp Leu Val Ile
145 150 155 160

Cys Pro Glu Gly Thr Thr Cys Arg Glu Glu Tyr Leu Leu Arg Phe Ser
165 170 175

Ala Leu Phe Ala Glu Leu Ser Asp Arg Ile Val Pro Val Ala Met Asn
180 185 190

Cys Lys Gln Gly Met Phe Asn Gly Thr Thr Val Arg Gly Val Lys Phe
195 200 205

Trp Asp Pro Tyr Phe Phe Phe Met Asn Pro Arg Pro Ser Tyr Glu Ala
210 215 220

Thr Phe Leu Asp Arg Leu Pro Glu Glu Met Thr Val Asn Gly Gly Gly
225 230 235 240

Lys Thr Pro Ile Glu Val Ala Asn Tyr Val Gln Lys Val Ile Gly Ala
245 250 255

Val Leu Gly Phe Glu Cys Thr Glu Leu Thr Arg Lys Asp Lys Tyr Leu
260 265 270

Leu Leu Gly Gly Asn Asp Gly Lys Val Glu Ser Ile Asn Asn Thr Lys
275 280 285

Lys



<210> 3
<211> 939
<212> DNA
<213> Arabidopsis sp.

<400> 3
atgacgagct ttactacttc ccttcatgct gtcccgagtg aaaaatttat gggcgaaaca 60
agacgtactg gcattcaatg gtctaaccgc tctttaagac atgatcctta cagatttctt 120
gataagaaat cacctagatc aagtcaattg gcaagagata tcactgtgag agcagatctt 180
tcaggagctg caacccctga ctcttctttt cctgaaccag agattaagtt gagctcaaga 240
ctcagaggga tattcttttg tgttgttgct ggcatttcgg ctacttttct cattgtcctg 300
atgattattg ggcatccgtt cgtccttctc ttcgatccct ataggagaaa attccaccac 360
ttcattgcta aactttgggc ttccataagc atttatccgt tttacaaaat caacatcgag 420
ggtttggaga atctgccatc atcagacact cctgctgtat atgtttcaaa ccaccaaagt 480
tttctggata tctacacact tcttagtctt ggaaaaagct ttaagttcat cagcaagaca 540
gggatattcg taattcccat catcggttgg gccatgtcca tgatgggtgt cgttcccttg 600
aagcggatgg acccaagaag ccaagtggat tgcttaaaac gctgcatgga acttttaaag 660
aagggagcat ctgtgttttt cttcccagaa ggaacacgga gtaaggatgg tcggttaggt 720
tctttcaaga aaggcgcatt tacagtggct gcgaagaccg gagttgcagt agttccaata 780
acgctaatgg gaacaggcaa aatcatgcca acgggtagtg aaggtatact gaaccatggg 840
aatgtgagag ttatcatcca taaaccaata catggaagca aagcggatgt tctttgcaac 900
gaggccagaa gcaagattgc agaatcaatg gatctctaa 939



<210> 4
<211> 312
<212> PRT
<213> Arabidopsis sp.

<400> 4

Met Thr Ser Phe Thr Thr Ser Leu His Ala Val Pro Ser Glu Lys Phe
1 5 10 15

Met Gly Glu Thr Arg Arg Thr Gly Ile Gln Trp Ser Asn Arg Ser Leu
20 25 30

Arg His Asp Pro Tyr Arg Phe Leu Asp Lys Lys Ser Pro Arg Ser Ser
35 40 45

Gln Leu Ala Arg Asp Ile Thr Val Arg Ala Asp Leu Ser Gly Ala Ala
50 55 60

Thr Pro Asp Ser Ser Phe Pro Glu Pro Glu Ile Lys Leu Ser Ser Arg
65 70 75 80

Leu Arg Gly Ile Phe Phe Cys Val Val Ala Gly Ile Ser Ala Thr Phe
85 90 95

Leu Ile Val Leu Met Ile Ile Gly His Pro Phe Val Leu Leu Phe Asp
100 105 110

Pro Tyr Arg Arg Lys Phe His His Phe Ile Ala Lys Leu Trp Ala Ser
115 120 125

Ile Ser Ile Tyr Pro Phe Tyr Lys Ile Asn Ile Glu Gly Leu Glu Asn
130 135 140

Leu Pro Ser Ser Asp Thr Pro Ala Val Tyr Val Ser Asn His Gln Ser
145 150 155 160

Phe Leu Asp Ile Tyr Thr Leu Leu Ser Leu Gly Lys Ser Phe Lys Phe
165 170 175

Ile Ser Lys Thr Gly Ile Phe Val Ile Pro Ile Ile Gly Trp Ala Met
180 185 190

Ser Met Met Gly Val Val Pro Leu Lys Arg Met Asp Pro Arg Ser Gln
195 200 205

Val Asp Cys Leu Lys Arg Cys Met Glu Leu Leu Lys Lys Gly Ala Ser
210 215 220

Val Phe Phe Phe Pro Glu Gly Thr Arg Ser Lys Asp Gly Arg Leu Gly
225 230 235 240

Ser Phe Lys Lys Gly Ala Phe Thr Val Ala Ala Lys Thr Gly Val Ala
245 250 255

Val Val Pro Ile Thr Leu Met Gly Thr Gly Lys Ile Met Pro Thr Gly
260 265 270

Ser Glu Gly Ile Leu Asn His Gly Asn Val Arg Val Ile Ile His Lys
275 280 285

Pro Ile His Gly Ser Lys Ala Asp Val Leu Cys Asn Glu Ala Arg Ser
290 295 300

Lys Ile Ala Glu Ser Met Asp Leu
305 310

<210> 5 
<211> 1197 
<212> DNA 

<213> Arabidopsis sp. 
<400> 5 

atggaatcag agctcaaaga tttgaattcg aattcgaatc ctccgtcgag caaagaggac 60 
cggccgttac tgaaatcaga atccgatttg gcggctgcca ttgaagagtt agacaaaaag 120 
ttcgcacctt acgcgaggac cgatttgtat gggacgatgg gtttgggtcc tttcccgatg 180 
acggagaata ttaaattggc ggttgcattg gtgactcttg ttccattgcg gtttcttctc 240 
tcgatgagca tcttgcttct ctattacttg atttgtaggg tatttacgct gttttctgct 300 
ccttatcgtg ggccagagga agaggaagat gaaggtggag ttgtttttca ggaagattat 360 
gctcacatgg aaggttggaa acggactgtt atcgtccggt ctgggaggtt tctctctagg 420 
gttttgcttt tcgtttttgg gttttattgg attcacgaga gctgtccaga tcgagattca 480

gacatggatt ctaatcctaa aactacttct acagagatta accagaaagg ggaagccgcc 540 

acggaggaac ctgaaagacc tggagccatt gtgtccaatc atgtttcgta cttggacatt 600 

ttgtatcata tgtctgcttc ttttccaagt tttgttgcca agagatcagt gggcaaactt 660 

cctcttgttg gcctcattag caaatgcctt ggttgtgtct atgttcaaag agaagcaaaa 720 

tcgcctgatt tcaagggtgt atctggcaca gtaaatgaaa gagttcgaga agctcatagc 780 

aataaatctg ctccaactat tatgcttttt ccagaaggaa caactaccaa tggagactac 840 

ttacttacat tcaagacagg tgcatttttg gctggaactc cagttcttcc ggtaatatta 900 

aaatatccgt atgagcgctt cagtgtggca tgggatacca tatccggggc acgccacatt 960 

ttattccttc tctgtcaagt cgtaaatcac ttggaagtca tacggttacc tgtatactac 1020

ccatcccaag aagagaaaga cgatcccaaa ctttatgcta gcaatgttcg gaaattaatg 1080

gccaccgagg gtaacttgat tctatcggag ttgggactta gcgacaaaag gatatatcac 1140

gcaactctca atggtaatct tagtcaaacc cgtgatttcc atcagaaaga agaatga 1197

<210> 6 
<211> 398 
<212> PRT 

<213> Arabidopsis sp. 
<400> 6 

Met Glu Ser Glu Leu Lys Asp Leu Asn Ser Asn Ser Asn Pro Pro Ser
1 5 10 15

Ser Lys Glu Asp Arg Pro Leu Leu Lys Ser Glu Ser Asp Leu Ala Ala
20 25 30

Ala Ile Glu Glu Leu Asp Lys Lys Phe Ala Pro Tyr Ala Arg Thr Asp
35 40 45

Leu Tyr Gly Thr Met Gly Leu Gly Pro Phe Pro Met Thr Glu Asn Ile
50 55 60

Lys Leu Ala Val Ala Leu Val Thr Leu Val Pro Leu Arg Phe Leu Leu
65 70 75 80

Ser Met Ser Ile Leu Leu Leu Tyr Tyr Leu Ile Cys Arg Val Phe Thr
85 90 95

Leu Phe Ser Ala Pro Tyr Arg Gly Pro Glu Glu Glu Glu Asp Glu Gly
100 105 110

Gly Val Val Phe Gln Glu Asp Tyr Ala His Met Glu Gly Trp Lys Arg
115 120 125

Thr Val Ile Val Arg Ser Gly Arg Phe Leu Ser Arg Val Leu Leu Phe
130 135 140

Val Phe Gly Phe Tyr Trp Ile His Glu Ser Cys Pro Asp Arg Asp Ser
145 150 155 160

Asp Met Asp Ser Asn Pro Lys Thr Thr Ser Thr Glu Ile Asn Gln Lys
165 170 175

Gly Glu Ala Ala Thr Glu Glu Pro Glu Arg Pro Gly Ala Ile Val Ser
180 185 190

Asn His Val Ser Tyr Leu Asp Ile Leu Tyr His Met Ser Ala Ser Phe
195 200 205

Pro Ser Phe Val Ala Lys Arg Ser Val Gly Lys Leu Pro Leu Val Gly
210 215 220

Leu Ile Ser Lys Cys Leu Gly Cys Val Tyr Val Gln Arg Glu Ala Lys
225 230 235 240

Ser Pro Asp Phe Lys Gly Val Ser Gly Thr Val Asn Glu Arg Val Arg
245 250 255

Glu Ala His Ser Asn Lys Ser Ala Pro Thr Ile Met Leu Phe Pro Glu
260 265 270

Gly Thr Thr Thr Asn Gly Asp Tyr Leu Leu Thr Phe Lys Thr Gly Ala
275 280 285

Phe Leu Ala Gly Thr Pro Val Leu Pro Val Ile Leu Lys Tyr Pro Tyr
290 295 300

Glu Arg Phe Ser Val Ala Trp Asp Thr Ile Ser Gly Ala Arg His Ile
305 310 315 320

Leu Phe Leu Leu Cys Gln Val Val Asn His Leu Glu Val Ile Arg Leu
325 330 335

Pro Val Tyr Tyr Pro Ser Gln Glu Glu Lys Asp Asp Pro Lys Leu Tyr
340 345 350

Ala Ser Asn Val Arg Lys Leu Met Ala Thr Glu Gly Asn Leu Ile Leu
355 360 365

Ser Glu Leu Gly Leu Ser Asp Lys Arg Ile Tyr His Ala Thr Leu Asn
370 375 380

Gly Asn Leu Ser Gln Thr Arg Asp Phe His Gln Lys Glu Glu
385 390 395



<210> 7 
<211> 1131 
<212> DNA 

<213> Arabidopsis sp. 
<400> 7 

atgagcagta cggcagggag gctcgtgact tcaaaatccg agcttgacct cgatcaccct 60 

aacatcgaag attaccttcc ttctggttct tccatcaatg aacctcgcgg caagctcagc 120 

ctgcgtgatt tgctagacat ctctccaacg ctcactgaag ctgctggtgc cattgttgat 180 

gactcgttca caagatgttt caaatcaaat cctccagaac cttggaactg gaatatttac 240 

ttattcccac tatactgctt tggggttgtt gttagatact gtatcctctt tcccttgagg 300 

tgcttcactt tagcttttgg gtggattatt ttcctttcat tgtttatccc tgtaaatgcg 360 

ttgctgaaag gtcaagatag gttgaggaaa aagatagaga gggtcttggt ggaaatgatt 420 

tgcagctttt ttgtcgcctc atggaccgga gttgtcaaat atcacgggcc acgtcctagc 480 

atccgtccta agcaggtcta tgttgccaac catacttcaa tgattgattt catcgtattg 540 

gagcagatga ccgcatttgc tgttataatg cagaagcatc ctggttgggt tggtcttctg 600 

caaagcacaa tattagagag tgtgggatgt atctggttca atcgttcaga ggcaaaggat 660 

cgtgaaattg tagcaaaaaa gttaagggac catgtccaag gagctgacag taatcctctt 720 

ctcatatttc ccgaagggac atgtgtaaat aataattaca cagtgatgtt taagaagggt 780 

gcttttgaat tggactgcac tgtttgtcca attgcaatta aatacaacaa gatttttgtt 840 

gacgccttct ggaatagcag aaaacaatca tttactatgc acttgctgca actcatgaca 900 

tcatgggctg ttgtatgtga agtgtggtac ttggaaccac aaaccataag gcccggtgaa 960 

acaggaattg aatttgcaga gagggtcaga gacatgatat ctcttcgggc gggtctcaaa 1020

aaggtccctt gggatggata cttgaagtat tcgagaccaa gccccaagca tagtgaacgc 1080

aagcaacaga gtttcgcaga gtcgatcctg gctagattgg aagagaagtg a 1131

<210> 8 
<211> 376 
<212> PRT 

<213> Arabidopsis sp. 
<400> 8 

Met Ser Ser Thr Ala Gly Arg Leu Val Thr Ser Lys Ser Glu Leu Asp
1 5 10 15

Leu Asp His Pro Asn Ile Glu Asp Tyr Leu Pro Ser Gly Ser Ser Ile
20 25 30

Asn Glu Pro Arg Gly Lys Leu Ser Leu Arg Asp Leu Leu Asp Ile Ser
35 40 45

Pro Thr Leu Thr Glu Ala Ala Gly Ala Ile Val Asp Asp Ser Phe Thr
50 55 60

Arg Cys Phe Lys Ser Asn Pro Pro Glu Pro Trp Asn Trp Asn Ile Tyr
65 70 75 80

Leu Phe Pro Leu Tyr Cys Phe Gly Val Val Val Arg Tyr Cys Ile Leu
85 90 95

Phe Pro Leu Arg Cys Phe Thr Leu Ala Phe Gly Trp Ile Ile Phe Leu
100 105 110

Ser Leu Phe Ile Pro Val Asn Ala Leu Leu Lys Gly Gln Asp Arg Leu
115 120 125

Arg Lys Lys Ile Glu Arg Val Leu Val Glu Met Ile Cys Ser Phe Phe
130 135 140

Val Ala Ser Trp Thr Gly Val Val Lys Tyr His Gly Pro Arg Pro Ser
145 150 155 160

Ile Arg Pro Lys Gln Val Tyr Val Ala Asn His Thr Ser Met Ile Asp
165 170 175

Phe Ile Val Leu Glu Gln Met Thr Ala Phe Ala Val Ile Met Gln Lys
180 185 190

His Pro Gly Trp Val Gly Leu Leu Gln Ser Thr Ile Leu Glu Ser Val
195 200 205

Gly Cys Ile Trp Phe Asn Arg Ser Glu Ala Lys Asp Arg Glu Ile Val
210 215 220

Ala Lys Lys Leu Arg Asp His Val Gln Gly Ala Asp Ser Asn Pro Leu
225 230 235 240

Leu Ile Phe Pro Glu Gly Thr Cys Val Asn Asn Asn Tyr Thr Val Met
245 250 255

Phe Lys Lys Gly Ala Phe Glu Leu Asp Cys Thr Val Cys Pro Ile Ala
260 265 270

Ile Lys Tyr Asn Lys Ile Phe Val Asp Ala Phe Trp Asn Ser Arg Lys
275 280 285

Gln Ser Phe Thr Met His Leu Leu Gln Leu Met Thr Ser Trp Ala Val
290 295 300

Val Cys Glu Val Trp Tyr Leu Glu Pro Gln Thr Ile Arg Pro Gly Glu
305 310 315 320

Thr Gly Ile Glu Phe Ala Glu Arg Val Arg Asp Met Ile Ser Leu Arg
325 330 335

Ala Gly Leu Lys Lys Val Pro Trp Asp Gly Tyr Leu Lys Tyr Ser Arg
340 345 350

Pro Ser Pro Lys His Ser Glu Arg Lys Gln Gln Ser Phe Ala Glu Ser
355 360 365

Ile Leu Ala Arg Leu Glu Glu Lys
370 375



<210> 9
<211> 965
<212> DNA
<213> Arabidopsis sp.

<400> 9
gttgttaagt tacaagtctc ttcaaaaaca cacacacacg tctctcttca cagccaatca 60
tcgatcacag ctcgattttc ctttattgtt ccgttggttt tcttgagnat ttttctttct 120
tgggatcatc aaactngtcg gtaaggwaac ttcacggacg gatcttcaat gttgagctgt 180
tctaatggta ccgtcgtgat cgcaaccgcc atggtttgct caagcaccgc tctgtttctc 240
gccatggctc gtcaattcca tggaaatcat caaaatccta aggttcttga tcagactcta 300
cgacccattc tccgttcttg tctatcttca gaggaaacga agaaacaggg gaagaagata 360
aagaaagtgc ggttcgcgga taatgtgaaa gatacgaaag gtaacgggga agagtaccgg 420
aggagggaat tgaaccggaa aagcgtaccg aagccagtga ctaaaccggg aaagaccggt 480
tctatgtgta gaatctctac catgccagcg aaccggatgg ctctgtacaa tgggattctt 540
agagaccgag atcacagagt tcaatattct tattgacttt ttcttcttga ttagtcaata 600
gatttaggtt ttgtaaatct ttcttttgtt tttcggtaat attagatttt ttcttggaaa 660
tttcagatat tgtagacttt gtagttgggt ggtcttcttt ttctcccttt ttgtgtctca 720
tagtagtagg tggttttctt atgctccact tatctactta cttgttttaa atcaagtgat 780
gatgtaaata attgacatgt aagtagtcat tagaaatttg aaaaggcaaa tgaaagaata 840
taaatttgta aaaacatagt gtgcctattg tacatataaa ctctcttttg ttggggatat 900
ctatggaatt tatattgatt gtgttgaaaa aacaaaaaaa aaaaaaaaaa aaaaaaaaaa 960
aaaaa 965



<210> 10 
<211> 1593 
<212> DNA 

<213> Arabidopsis sp. 



<400> 10 

atgtccggta ataagatctc gactcttcaa gctcttgtct tcttcttgta ccggtttttc 60 

attctccgtc gttggtgtca tcgtagccct aaacaaaaat accaaaaatg cccttctcac 120 

ggcctccacc aatatcaaga cctatcgaat cacactttga tattcaacgt cgaaggagct 180 

ctactcaaat caaactcttt attcccttac ttcatggttg tggcattcga agccggaggg 240 

gtgataaggt cacttttcct cttagttctt tatccattta taagcttgat gagctacgaa 300 

atgggcttga agacgatggt gatgctgagc ttctttggag ttaaaaagga aagcttccga 360 

gtggggaaat cagttttgcc taagtatttt ctagaagatg ttgggctcga gatgttccag 420 

gttttgaaaa gaggaggcaa gagagttgct gtgagtgatt taccacaagt tatgattgat 480 

gtattcttgc gagattactt ggagatagaa gttgtggtcg gaagagacat gaaaatggtc 540 

ggtggttact acctaggcat cgtggaggat aagaagaacc ttgaaattgc ttttgataaa 600 

gtggttcaag aagaaagact tggtagtggt cgtcgtctta ttggcatcac ttcctttaac 660 

tcgccaagtc acagatctct cttctctcaa ttttgccagg aaatttactt cgtcagaaat 720 

tcagacaaga aaagttggca aaccctacca caagatcaat accctaaacc attgattttc 780 

cacgatggtc gtttagccgt taagccaaca cctttaaaca cactcgtatt attcatgtgg 840 

gccccattcg ccgccgtctt agccgctgca agactcgtct tcggcctaaa cttaccttac 900 

tccctagcca atcccttcct cgccttttcc ggtatccacc ttactctcac cgtcaacaac 960 

cacaacgacc taatatccgc cgacagaaaa agaggttgtc tctttgtgtg taaccataga 1020

acgttattgg acccacttta catttcatac gctctaagaa agaaaaacat gaaagccgtg 1080

acgtatagtc taagcagatt atctgagctt ctggctccga tcaagaccgt tagattgact 1140

cgtgatcgag tcaaagatgg tcaagccatg gagaaattgc tgagccaggg agatctcgtg 1200

gtttgtccgg aagggactac gtgtagagag ccttacttgc ttcggtttag tccacttttc 1260

tctgaggttt gtgacgtcat cgtacctgtt gctattgact cacacgtgac tttcttctat 1320

ggcacgacgg ctagtggtct taaggcattt gatcccattt tcttcctttt gaatcctttc 1380

ccttcctaca ccgtcaaatt gcttgaccct gtctctggaa gtagctcgtc cacgtgtcga 1440

ggagtccctg acaatggaaa agttaacttc gaggtggcta atcacgtgca gcatgagatc 1500

gggaatgcct tggggtttga gtgcaccaac ctcacgagaa gagataagta cttgatcttg 1560

gccggtaata acggagttgt caagaaaaaa taa 1593



<210> 11
<211> 530
<212> PRT
<213> Arabidopsis sp.

<400> 11

Met Ser Gly Asn Lys Ile Ser Thr Leu Gln Ala Leu Val Phe Phe Leu
1 5 10 15

Tyr Arg Phe Phe Ile Leu Arg Arg Trp Cys His Arg Ser Pro Lys Gln
20 25 30

Lys Tyr Gln Lys Cys Pro Ser His Gly Leu His Gln Tyr Gln Asp Leu
35 40 45

Ser Asn His Thr Leu Ile Phe Asn Val Glu Gly Ala Leu Leu Lys Ser
50 55 60

Asn Ser Leu Phe Pro Tyr Phe Met Val Val Ala Phe Glu Ala Gly Gly
65 70 75 80

Val Ile Arg Ser Leu Phe Leu Leu Val Leu Tyr Pro Phe Ile Ser Leu
85 90 95

Met Ser Tyr Glu Met Gly Leu Lys Thr Met Val Met Leu Ser Phe Phe
100 105 110

Gly Val Lys Lys Glu Ser Phe Arg Val Gly Lys Ser Val Leu Pro Lys
115 120 125

Tyr Phe Leu Glu Asp Val Gly Leu Glu Met Phe Gln Val Leu Lys Arg
130 135 140

Gly Gly Lys Arg Val Ala Val Ser Asp Leu Pro Gln Val Met Ile Asp
145 150 155 160

Val Phe Leu Arg Asp Tyr Leu Glu Ile Glu Val Val Val Gly Arg Asp
165 170 175

Met Lys Met Val Gly Gly Tyr Tyr Leu Gly Ile Val Glu Asp Lys Lys
180 185 190

Asn Leu Glu Ile Ala Phe Asp Lys Val Val Gln Glu Glu Arg Leu Gly
195 200 205

Ser Gly Arg Arg Leu Ile Gly Ile Thr Ser Phe Asn Ser Pro Ser His
210 215 220

Arg Ser Leu Phe Ser Gln Phe Cys Gln Glu Ile Tyr Phe Val Arg Asn
225 230 235 240

Ser Asp Lys Lys Ser Trp Gln Thr Leu Pro Gln Asp Gln Tyr Pro Lys
245 250 255

Pro Leu Ile Phe His Asp Gly Arg Leu Ala Val Lys Pro Thr Pro Leu
260 265 270

Asn Thr Leu Val Leu Phe Met Trp Ala Pro Phe Ala Ala Val Leu Ala
275 280 285

Ala Ala Arg Leu Val Phe Gly Leu Asn Leu Pro Tyr Ser Leu Ala Asn
290 295 300

Pro Phe Leu Ala Phe Ser Gly Ile His Leu Thr Leu Thr Val Asn Asn
305 310 315 320

His Asn Asp Leu Ile Ser Ala Asp Arg Lys Arg Gly Cys Leu Phe Val
325 330 335

Cys Asn His Arg Thr Leu Leu Asp Pro Leu Tyr Ile Ser Tyr Ala Leu
340 345 350

Arg Lys Lys Asn Met Lys Ala Val Thr Tyr Ser Leu Ser Arg Leu Ser
355 360 365

Glu Leu Leu Ala Pro Ile Lys Thr Val Arg Leu Thr Arg Asp Arg Val
370 375 380

Lys Asp Gly Gln Ala Met Glu Lys Leu Leu Ser Gln Gly Asp Leu Val
385 390 395 400

Val Cys Pro Glu Gly Thr Thr Cys Arg Glu Pro Tyr Leu Leu Arg Phe
405 410 415

Ser Pro Leu Phe Ser Glu Val Cys Asp Val Ile Val Pro Val Ala Ile
420 425 430

Asp Ser His Val Thr Phe Phe Tyr Gly Thr Thr Ala Ser Gly Leu Lys
435 440 445

Ala Phe Asp Pro Ile Phe Phe Leu Leu Asn Pro Phe Pro Ser Tyr Thr
450 455 460

Val Lys Leu Leu Asp Pro Val Ser Gly Ser Ser Ser Ser Thr Cys Arg
465 470 475 480

Gly Val Pro Asp Asn Gly Lys Val Asn Phe Glu Val Ala Asn His Val
485 490 495

Gln His Glu Ile Gly Asn Ala Leu Gly Phe Glu Cys Thr Asn Leu Thr
500 505 510

Arg Arg Asp Lys Tyr Leu Ile Leu Ala Gly Asn Asn Gly Val Val Lys
515 520 525

Lys Lys
530



<210> 12
<211> 1509
<212> DNA
<213> Arabidopsis sp.

<400> 12
atggttatgg agcaagctgg aacgacatcg tattcggtcg tgtcagagtt tgaaggaaca 60
atactgaaga acgcagattc attctcttac ttcatgctcg tagccttcga agcagctggt 120
ctaattcgtt tcgctatctt gttgtttcta tggcccgtaa tcacactcct tgacgttttc 180
agctacaaaa acgcagctct caagctcaag atttttgtag ccactgttgg tctacgtgaa 240
ccggagatcg aatcagtggc tagagccgtt ctgccaaaat tctacatgga cgacgtaagc 300
atggacacgt ggagggtttt cagctcgtgt aagaagaggg tcgtggtcac gagaatgcct 360
cgagttatgg tggagaggtt tgctaaggag catcttagag cagatgaggt catcggtacg 420
gaactgattg taaaccggtt cggttttgtc accggtttga ttcgcgaaac ggatgttgat 480
cagtctgctt tgaaccgtgt cgctaatttg tttgttggtc ggaggcctca actaggtctt 540
ggaaaaccgg ctttgaccgc ctctacaaat ttcttatcgt tatgtgagga gcatattcat 600
gcaccaatcc cggagaacta caaccacggt gaccaacaac ttcagctacg tccacttccg 660
gtgatatttc acgacggaag actagtgaag cggccaacgc cggccaccgc tctcatcatc 720
ctcctttgga tcccatttgg aatcattctc gccgtgatcc ggatctttct tggagccgtc 780
ctcccattgt gggccacacc ttacgtctct cagatattcg gtggccatat catcgtcaaa 840
ggaaagcctc ctcagccacc ggcggctgga aaatccggcg tgctctttgt gtgtactcac 900
agaaccctaa tggaccctgt ggtattatct tatgtcctcg gacgtagcat cccagccgtt 960
acttactcaa tctcgcgctt atcagagatc ttatctccca ttccaaccgt ccgattgaca 1020
agaatccgag atgtggatgc ggctaagatc aaacaacaac tgtcaaaagg agatctagtg 1080
gtttgtcctg agggaaccac ttgtcgtgaa ccgtttttgt taagattcag cgcgcttttc 1140
gctgagttaa cggataggat tgttccggtt gcgatgaact acagagtcgg attcttccac 1200
gcgactacag cgagaggctg gaagggtttg gacccaattt tcttcttcat gaacccaaga 1260
ccggtttacg agattacgtt cttgaaccag cttcctatgg aggcaacatg ttcgtccggg 1320
aagagcccgc atgacgtggc gaactatgtt cagagaatct tggcggctac gttagggttt 1380
gagtgcacca acttcacaag aaaagataag tatagggttc tcgctggaaa cgatggaaca 1440
gtgtcgtact tgtcgttgct agaccaattg aagaaggtgg ttagcacttt cgagccttgt 1500
ctccattga 1509

<210> 13 
<211> 502 
<212> PRT 

<213> Arabidopsis sp. 
<400> 13 

Met Val Met Glu Gln Ala Gly Thr Thr Ser Tyr Ser Val Val Ser Glu
1 5 10 15

Phe Glu Gly Thr Ile Leu Lys Asn Ala Asp Ser Phe Ser Tyr Phe Met
20 25 30

Leu Val Ala Phe Glu Ala Ala Gly Leu Ile Arg Phe Ala Ile Leu Leu
35 40 45

Phe Leu Trp Pro Val Ile Thr Leu Leu Asp Val Phe Ser Tyr Lys Asn
50 55 60

Ala Ala Leu Lys Leu Lys Ile Phe Val Ala Thr Val Gly Leu Arg Glu
65 70 75 80

Pro Glu Ile Glu Ser Val Ala Arg Ala Val Leu Pro Lys Phe Tyr Met
85 90 95

Asp Asp Val Ser Met Asp Thr Trp Arg Val Phe Ser Ser Cys Lys Lys
100 105 110

Arg Val Val Val Thr Arg Met Pro Arg Val Met Val Glu Arg Phe Ala
115 120 125

Lys Glu His Leu Arg Ala Asp Glu Val Ile Gly Thr Glu Leu Ile Val
130 135 140

Asn Arg Phe Gly Phe Val Thr Gly Leu Ile Arg Glu Thr Asp Val Asp
145 150 155 160

Gln Ser Ala Leu Asn Arg Val Ala Asn Leu Phe Val Gly Arg Arg Pro
165 170 175

Gln Leu Gly Leu Gly Lys Pro Ala Leu Thr Ala Ser Thr Asn Phe Leu
180 185 190

Ser Leu Cys Glu Glu His Ile His Ala Pro Ile Pro Glu Asn Tyr Asn
195 200 205

His Gly Asp Gln Gln Leu Gln Leu Arg Pro Leu Pro Val Ile Phe His
210 215 220

Asp Gly Arg Leu Val Lys Arg Pro Thr Pro Ala Thr Ala Leu Ile Ile
225 230 235 240

Leu Leu Trp Ile Pro Phe Gly Ile Ile Leu Ala Val Ile Arg Ile Phe
245 250 255

Leu Gly Ala Val Leu Pro Leu Trp Ala Thr Pro Tyr Val Ser Gln Ile
260 265 270

Phe Gly Gly His Ile Ile Val Lys Gly Lys Pro Pro Gln Pro Pro Ala
275 280 285

Ala Gly Lys Ser Gly Val Leu Phe Val Cys Thr His Arg Thr Leu Met
290 295 300

Asp Pro Val Val Leu Ser Tyr Val Leu Gly Arg Ser Ile Pro Ala Val
305 310 315 320

Thr Tyr Ser Ile Ser Arg Leu Ser Glu Ile Leu Ser Pro Ile Pro Thr
325 330 335

Val Arg Leu Thr Arg Ile Arg Asp Val Asp Ala Ala Lys Ile Lys Gln
340 345 350

Gln Leu Ser Lys Gly Asp Leu Val Val Cys Pro Glu Gly Thr Thr Cys
355 360 365

Arg Glu Pro Phe Leu Leu Arg Phe Ser Ala Leu Phe Ala Glu Leu Thr
370 375 380

Asp Arg Ile Val Pro Val Ala Met Asn Tyr Arg Val Gly Phe Phe His
385 390 395 400

Ala Thr Thr Ala Arg Gly Trp Lys Gly Leu Asp Pro Ile Phe Phe Phe
405 410 415

Met Asn Pro Arg Pro Val Tyr Glu Ile Thr Phe Leu Asn Gln Leu Pro
420 425 430

Met Glu Ala Thr Cys Ser Ser Gly Lys Ser Pro His Asp Val Ala Asn
435 440 445

Tyr Val Gln Arg Ile Leu Ala Ala Thr Leu Gly Phe Glu Cys Thr Asn
450 455 460

Phe Thr Arg Lys Asp Lys Tyr Arg Val Leu Ala Gly Asn Asp Gly Thr
465 470 475 480

Val Ser Tyr Leu Ser Leu Leu Asp Gln Leu Lys Lys Val Val Ser Thr
485 490 495

Phe Glu Pro Cys Leu His
500

<210> 14
<211> 1563
<212> DNA
<213> Arabidopsis sp.

<400> 14
atgtccgcca agatttcaat attccaagct cttgtctttc tattctaccg gtttatcctc 60
cggcgatatc ggaactctaa accaaaatac caaaatggcc cttcttctct cctccaatcc 120
gacctatcac gccacacatt gatcttcaac gtagaaggag ctcttctcaa atccgactct 180
ctcttccctt acttcatgtt agtagcattt gaggcgggag gcgtaataag gtcatttctc 240
ctcttcattc tctatccatt gataagcttg atgagccatg agatgggtgt caaagtgatg 300
gtaatggtga gcttcttcgg gatcaaaaaa gaaggttttc gagcggggag agcggttttg 360
cctaaatact ttctagaaga tgtcggactc gagatcttcg aagtgttgaa gagaggaggg 420
aagaaaatcg gagtgagtga tgatcttcct caagttatga tcgaagggtt cttgagagat 480
tacttggaga ttgacgttgt ggtcgggaga gaaatgaaag tcgttggagg ttattatcta 540
ggtatcatgg aggataaaac caaacatgat cttgtctttg atgagttagt tcgtaaagag 600
agactaaaca ccggtcgtgt tattggcatc acttccttca atacatctct tcaccgatat 660
ctattctctc agttttgcca ggaaatttat ttcgtgaaga aatcagacaa gcgaagctgg 720
caaaccctac cacgaagcca gtaccctaaa ccattgattt tccatgatgg ccgtctcgcg 780
atcaaaccaa ccctaatgaa cactttggtc ttgttcatgt ggggtccttt cgcagccgca 840
gccgcagcag ccagactctt cgtctctctt tgcatccctt actctttatc aatcccgatc 900
ctcgcctttt ccggttgcag actaaccgtc actaacgact acgtttcatc tcaaaaacaa 960
aaaccaagtc aacgcaaagg ttgtctcttt gtatgtaacc ataggacttt attggaccct 1020
ctctatgttg cattcgcttt gagaaagaaa aacatcaaaa ctgtaacgta tagtttgagt 1080
agggtatctg agattttggc tccgatcaag acggtgagac tgacccgtga tcgggtgagc 1140
gacggtcaag ccatggagaa attgttaacc gaaggagatc tcgttgtttg tcctgaagga 1200
accacttgta gagaacctta cctgcttagg tttagccctt tgttcaccga ggttagtgat 1260
gtcatcgttc ccgtggctgt gacggtacac gtgaccttct tctacggtac aacggcgagt 1320
ggtcttaagg cacttgaccc gcttttcttc ctcttggatc cttatcctac ctacaccatc 1380
caatttctcg accctgtctc cggtgccacg tgccaagatc ctgatggaaa gttgaagttt 1440
gaggtggcca acaatgttca gagtgatatt gggaaggcgc tggatttcga gtgcacaagt 1500
ctcactagaa aagacaagta tttgatcttg gccggtaata atggagtagt taagaaaaat 1560
taa 1563

<210> 15 
<211> 520 
<212> PRT 

<213> Arabidopsis sp. 
<400> 15 

Met Ser Ala Lys Ile Ser Ile Phe Gln Ala Leu Val Phe Leu Phe Tyr
1 5 10 15

Arg Phe Ile Leu Arg Arg Tyr Arg Asn Ser Lys Pro Lys Tyr Gln Asn
20 25 30

Gly Pro Ser Ser Leu Leu Gln Ser Asp Leu Ser Arg His Thr Leu Ile
35 40 45

Phe Asn Val Glu Gly Ala Leu Leu Lys Ser Asp Ser Leu Phe Pro Tyr
50 55 60

Phe Met Leu Val Ala Phe Glu Ala Gly Gly Val Ile Arg Ser Phe Leu
65 70 75 80

Leu Phe Ile Leu Tyr Pro Leu Ile Ser Leu Met Ser His Glu Met Gly
85 90 95

Val Lys Val Met Val Met Val Ser Phe Phe Gly Ile Lys Lys Glu Gly
100 105 110

Phe Arg Ala Gly Arg Ala Val Leu Pro Lys Tyr Phe Leu Glu Asp Val
115 120 125

Gly Leu Glu Ile Phe Glu Val Leu Lys Arg Gly Gly Lys Lys Ile Gly
130 135 140

Val Ser Asp Asp Leu Pro Gln Val Met Ile Glu Gly Phe Leu Arg Asp
145 150 155 160

Tyr Leu Glu Ile Asp Val Val Val Gly Arg Glu Met Lys Val Val Gly
165 170 175

Gly Tyr Tyr Leu Gly Ile Met Glu Asp Lys Thr Lys His Asp Leu Val
180 185 190

Phe Asp Glu Leu Val Arg Lys Glu Arg Leu Asn Thr Gly Arg Val Ile
195 200 205

Gly Ile Thr Ser Phe Asn Thr Ser Leu His Arg Tyr Leu Phe Ser Gln
210 215 220

Phe Cys Gln Glu Ile Tyr Phe Val Lys Lys Ser Asp Lys Arg Ser Trp
225 230 235 240

Gln Thr Leu Pro Arg Ser Gln Tyr Pro Lys Pro Leu Ile Phe His Asp
245 250 255

Gly Arg Leu Ala Ile Lys Pro Thr Leu Met Asn Thr Leu Val Leu Phe
260 265 270

Met Trp Gly Pro Phe Ala Ala Ala Ala Ala Ala Ala Arg Leu Phe Val
275 280 285

Ser Leu Cys Ile Pro Tyr Ser Leu Ser Ile Pro Ile Leu Ala Phe Ser
290 295 300

Gly Cys Arg Leu Thr Val Thr Asn Asp Tyr Val Ser Ser Gln Lys Gln
305 310 315 320

Lys Pro Ser Gln Arg Lys Gly Cys Leu Phe Val Cys Asn His Arg Thr
325 330 335

Leu Leu Asp Pro Leu Tyr Val Ala Phe Ala Leu Arg Lys Lys Asn Ile
340 345 350

Lys Thr Val Thr Tyr Ser Leu Ser Arg Val Ser Glu Ile Leu Ala Pro
355 360 365

Ile Lys Thr Val Arg Leu Thr Arg Asp Arg Val Ser Asp Gly Gln Ala
370 375 380

Met Glu Lys Leu Leu Thr Glu Gly Asp Leu Val Val Cys Pro Glu Gly
385 390 395 400

Thr Thr Cys Arg Glu Pro Tyr Leu Leu Arg Phe Ser Pro Leu Phe Thr
405 410 415

Glu Val Ser Asp Val Ile Val Pro Val Ala Val Thr Val His Val Thr
420 425 430

Phe Phe Tyr Gly Thr Thr Ala Ser Gly Leu Lys Ala Leu Asp Pro Leu
435 440 445

Phe Phe Leu Leu Asp Pro Tyr Pro Thr Tyr Thr Ile Gln Phe Leu Asp
450 455 460

Pro Val Ser Gly Ala Thr Cys Gln Asp Pro Asp Gly Lys Leu Lys Phe
465 470 475 480

Glu Val Ala Asn Asn Val Gln Ser Asp Ile Gly Lys Ala Leu Asp Phe
485 490 495

Glu Cys Thr Ser Leu Thr Arg Lys Asp Lys Tyr Leu Ile Leu Ala Gly
500 505 510

Asn Asn Gly Val Val Lys Lys Asn
515 520



<210> 16
<211> 1506
<212> DNA
<213> Arabidopsis sp.

<400> 16
atgggagctc aggagaaacg gcgccgtttc gagcagatat caaagtgcga tgttaaggac 60
cggtccaacc ataccgtggc cgctgatcta gacggaacac tactaatctc tcgtagcgcc 120
ttcccttact atttcctcgt agccctcgag gcagggagct tgctccgagc gttgatccta 180
cttgtgtccg taccattcgt ttatcttacg tacttgacca tctccgagac tttagccatc 240
aacgtatttg tcttcatcac gttcgcgggt ctcaagatcc gagacgttga gctagtggtc 300
cgttccgtcc tcccgaggtt ctatgcggag gacgtgaggc ccgatacctg gcgtatcttc 360
aacacgttcg ggaaacggta cataataact gcgagccctc gaattatggt cgagccattc 420
gtgaaaacat tcctaggagt tgataaagtt cttggaacag agctagaggt ctccaaatcg 480
ggtcgggcaa ccgggttcac cagaaaacca ggtattctcg tcggtcagta caaacgtgac 540
gtcgttttga gagagtttgg tggcctagcg tctgatttac ctgatttggg gctcggcgat 600
agcaagacgg accacgactt catgtccatc tgcaaggaag gttacatggt gccacgtacg 660
aaatgcgaac cattaccaag aaacaaactc ttaagcccca taatattcca cgagggcaga 720
ttagtccaac gcccaacgcc gttagttgct ctgttaactt tcctctggct tcccgtcggt 780
ttcgtcctct ctatcatccg cgtctacacg aatattccgt taccggaacg tatcgcccgt 840
tacaactaca agcttactgg catcaagcta gtcgtcaacg gccaccctcc tccgccgcca 900
aaacctggcc agccaggcca tcttttggtc tgcaaccacc gcaccgttct cgatcctgtg 960
gtcacagctg tcgcactcgg ccggaaaatc agctgcgtca cttacagcat cagcaagttc 1020
tctgagctaa tctcaccaat caaagccgtt gcgttgactc gtcaacgtga gaaagacgca 1080
gcgaacatca agcgtctttt ggaggaaggc gatctcgtga tatgtcccga gggaaccacg 1140
tgccgtgagc ctttccttct ccggtttagt gctcttttcg ctgagctcac ggaccggatc 1200
gttcccgtgg cgatcaacac aaagcagagc atgttcaatg gtaccaccac acgtggatac 1260
aagcttcttg atccttactt tgcgttcatg aacccgaggc cgacgtatga gatcacgttc 1320
ctcaaacaga ttccagctga gctgacgtgt aaaggaggca aatctccgat agaggttgcg 1380
aattacatac agagggtttt gggaggaacc ttaggttttg agtgcaccaa tttcacaaga 1440
aaggataagt acgcaatgct tgctggtact gacggtaggg ttccggtgaa gaaggagaag 1500
acgtga 1506

<210> 17 
<211> 500 
<212> PRT 

<213> Arabidopsis sp. 
<400> 17 

Met Gly Ala Gln Glu Lys Arg Arg Arg Phe Glu Gln Ile Ser Lys Cys
1 5 10 15

Asp Val Lys Asp Arg Ser Asn His Thr Val Ala Ala Asp Leu Asp Gly
20 25 30

Thr Leu Leu Ile Ser Arg Ser Ala Phe Pro Tyr Tyr Phe Leu Val Ala
35 40 45

Leu Glu Ala Gly Ser Leu Leu Arg Ala Leu Ile Leu Leu Val Ser Val
50 55 60

Pro Phe Val Tyr Leu Thr Tyr Leu Thr Ile Ser Glu Thr Leu Ala Ile
65 70 75 80

Asn Val Phe Val Phe Ile Thr Phe Ala Gly Leu Lys Ile Arg Asp Val
85 90 95

Glu Leu Val Val Arg Ser Val Leu Pro Arg Phe Tyr Ala Glu Asp Val
100 105 110

Arg Pro Asp Thr Trp Arg Ile Phe Asn Thr Phe Gly Lys Arg Tyr Ile
115 120 125

Ile Thr Ala Ser Pro Arg Ile Met Val Glu Pro Phe Val Lys Thr Phe
130 135 140

Leu Gly Val Asp Lys Val Leu Gly Thr Glu Leu Glu Val Ser Lys Ser
145 150 155 160

Gly Arg Ala Thr Gly Phe Thr Arg Lys Pro Gly Ile Leu Val Gly Gln
165 170 175

Tyr Lys Arg Asp Val Val Leu Arg Glu Phe Gly Gly Leu Ala Ser Asp
180 185 190

Leu Pro Asp Leu Gly Leu Gly Asp Ser Lys Thr Asp His Asp Phe Met
195 200 205

Ser Ile Cys Lys Glu Gly Tyr Met Val Pro Arg Thr Lys Cys Glu Pro
210 215 220

Leu Pro Arg Asn Lys Leu Leu Ser Pro Ile Ile Phe His Glu Gly Arg
225 230 235 240

Leu Val Gln Arg Pro Thr Pro Leu Val Ala Leu Leu Thr Phe Leu Trp
245 250 255

Leu Pro Val Gly Phe Val Leu Ser Ile Ile Arg Val Tyr Thr Asn Ile
260 265 270

Pro Leu Pro Glu Arg Ile Ala Arg Tyr Asn Tyr Lys Leu Thr Gly Ile
275 280 285

Lys Leu Val Val Asn Gly His Pro Pro Pro Pro Pro Lys Pro Gly Gln
290 295 300

Pro Gly His Leu Leu Val Cys Asn His Arg Thr Val Leu Asp Pro Val
305 310 315 320

Val Thr Ala Val Ala Leu Gly Arg Lys Ile Ser Cys Val Thr Tyr Ser
325 330 335

Ile Ser Lys Phe Ser Glu Leu Ile Ser Pro Ile Lys Ala Val Ala Leu
340 345 350

Thr Arg Gln Arg Glu Lys Asp Ala Ala Asn Ile Lys Arg Leu Leu Glu
355 360 365

Glu Gly Asp Leu Val Ile Cys Pro Glu Gly Thr Thr Cys Arg Glu Pro
370 375 380

Phe Leu Leu Arg Phe Ser Ala Leu Phe Ala Glu Leu Thr Asp Arg Ile
385 390 395 400

Val Pro Val Ala Ile Asn Thr Lys Gln Ser Met Phe Asn Gly Thr Thr
405 410 415

Thr Arg Gly Tyr Lys Leu Leu Asp Pro Tyr Phe Ala Phe Met Asn Pro
420 425 430

Arg Pro Thr Tyr Glu Ile Thr Phe Leu Lys Gln Ile Pro Ala Glu Leu
435 440 445

Thr Cys Lys Gly Gly Lys Ser Pro Ile Glu Val Ala Asn Tyr Ile Gln
450 455 460

Arg Val Leu Gly Gly Thr Leu Gly Phe Glu Cys Thr Asn Phe Thr Arg
465 470 475 480

Lys Asp Lys Tyr Ala Met Leu Ala Gly Thr Asp Gly Arg Val Pro Val
485 490 495

Lys Lys Glu Lys
500



<210> 18 
<211> 1620 
<212> DNA 

<213> Arabidopsis sp. 



<400> 18 

atggcggatc ctgatctgtc ttctcctttg atccaccatc aatcctccga tcaacctgaa 60
gttgttatct ctatcgccga cgacgacgac gacgagtcag gactcaatct tcttccagcc 120
gttgttgacc ctcgtgtttc acgaggtttt gagtttgacc atcttaatcc ttatggcttt 180
ctcagcgagt cagagcctcc ggttctcggt ccgacgacgg tggatccatt ccggaacaat 240
acacctggag ttagcggatt gtacgaagcg attaagctcg tgatttgtct tccgattgct 300
ctgattagac ttgttctctt tgctgctagc ttagctgttg gttacttggc tacaaaattg 360
gcacttgctg gctggaaaga taaagagaac cctatgcctc tttggagatg cagaatcatg 420
tggattactc ggatctgtac cagatgtatc ctcttctctt ttggctatca gtggataaga 480
aggaaaggga aacctgctcg gagagagatt gctccgattg ttgtatcaaa tcatgtttct 540
tatattgaac caatcttcta cttctatgaa ttatcaccga ccattgttgc atcggagtca 600
catgattcac ttccatttgt tggaactatt atcagggcaa tgcaggtgat atatgtgaat 660
agattctcac agacatcaag gaagaatgct gtgcatgaaa taaagagaaa agcttcctgc 720
gatagatttc ctcgtctgct gttattcccc gaaggaacca cgactaatgg gaaagttctt 780
atttccttcc aactcggtgc tttcatccct ggttacccta ttcaacctgt agtagtccgg 840
tatccccatg tacattttga tcaatcctgg ggaaatatct ctttgttgac gctcatgttt 900
agaatgttca ctcagtttca caatttcatg gaggttgaat atcttcctgt aatctatccc 960
agtgaaaagc aaaagcagaa tgctgtgcgt ctctcacaga agactagtca tgcaattgca 1020
acatctttga atgtcgtcca aacatcccat tcttttgcgg acttgatgct actcaacaaa 1080
gcaactgagt taaagctgga gaacccctca aattacatgg ttgaaatggc aagagttgag 1140
tcgctattcc atgtaagcag cttagaggca acgcgatttt tggatacatt tgtttccatg 1200
attccggact cgagtggacg tgttaggcta catgactttc ttcggggtct taaactgaaa 1260
ccttgccctc tttctaaaag gatatttgag ttcatcgatg tggagaaggt cggatcaatc 1320
actttcaaac agttcttgtt tgcctcgggc cacgtgttga cacagccgct ttttaagcaa 1380
acatgcgagc tagccttttc ccattgcgat gcagatggag atggctatat tacaattcaa 1440
gaactcggag aagctctcaa aaacacaatc ccaaacttga acaaggacga gattcgagga 1500
atgtaccatt tgctagacga cgaccaagat caaagaatca gccaaaatga cttgttgtcc 1560
tgcttaagaa gaaaccctct tctcatagcc atctttgcac ctgacttggc cccaacataa 1620

<210> 19 
<211> 539 
<212> PRT 

<213> Arabidopsis sp. 
<400> 19 

Met Ala Asp Pro Asp Leu Ser Ser Pro Leu Ile His His Gln Ser Ser
1 5 10 15

Asp Gln Pro Glu Val Val Ile Ser Ile Ala Asp Asp Asp Asp Asp Glu
20 25 30

Ser Gly Leu Asn Leu Leu Pro Ala Val Val Asp Pro Arg Val Ser Arg
35 40 45

Gly Phe Glu Phe Asp His Leu Asn Pro Tyr Gly Phe Leu Ser Glu Ser
50 55 60

Glu Pro Pro Val Leu Gly Pro Thr Thr Val Asp Pro Phe Arg Asn Asn
65 70 75 80

Thr Pro Gly Val Ser Gly Leu Tyr Glu Ala Ile Lys Leu Val Ile Cys
85 90 95

Leu Pro Ile Ala Leu Ile Arg Leu Val Leu Phe Ala Ala Ser Leu Ala
100 105 110

Val Gly Tyr Leu Ala Thr Lys Leu Ala Leu Ala Gly Trp Lys Asp Lys
115 120 125

Glu Asn Pro Met Pro Leu Trp Arg Cys Arg Ile Met Trp Ile Thr Arg
130 135 140

Ile Cys Thr Arg Cys Ile Leu Phe Ser Phe Gly Tyr Gln Trp Ile Arg
145 150 155 160

Arg Lys Gly Lys Pro Ala Arg Arg Glu Ile Ala Pro Ile Val Val Ser
165 170 175

Asn His Val Ser Tyr Ile Glu Pro Ile Phe Tyr Phe Tyr Glu Leu Ser
180 185 190

Pro Thr Ile Val Ala Ser Glu Ser His Asp Ser Leu Pro Phe Val Gly
195 200 205

Thr Ile Ile Arg Ala Met Gln Val Ile Tyr Val Asn Arg Phe Ser Gln
210 215 220

Thr Ser Arg Lys Asn Ala Val His Glu Ile Lys Arg Lys Ala Ser Cys
225 230 235 240

Asp Arg Phe Pro Arg Leu Leu Leu Phe Pro Glu Gly Thr Thr Thr Asn
245 250 255

Gly Lys Val Leu Ile Ser Phe Gln Leu Gly Ala Phe Ile Pro Gly Tyr
260 265 270

Pro Ile Gln Pro Val Val Val Arg Tyr Pro His Val His Phe Asp Gln
275 280 285

Ser Trp Gly Asn Ile Ser Leu Leu Thr Leu Met Phe Arg Met Phe Thr
290 295 300

Gln Phe His Asn Phe Met Glu Val Glu Tyr Leu Pro Val Ile Tyr Pro
305 310 315 320

Ser Glu Lys Gln Lys Gln Asn Ala Val Arg Leu Ser Gln Lys Thr Ser
325 330 335

His Ala Ile Ala Thr Ser Leu Asn Val Val Gln Thr Ser His Ser Phe
340 345 350

Ala Asp Leu Met Leu Leu Asn Lys Ala Thr Glu Leu Lys Leu Glu Asn
355 360 365

Pro Ser Asn Tyr Met Val Glu Met Ala Arg Val Glu Ser Leu Phe His
370 375 380

Val Ser Ser Leu Glu Ala Thr Arg Phe Leu Asp Thr Phe Val Ser Met
385 390 395 400

Ile Pro Asp Ser Ser Gly Arg Val Arg Leu His Asp Phe Leu Arg Gly
405 410 415

Leu Lys Leu Lys Pro Cys Pro Leu Ser Lys Arg Ile Phe Glu Phe Ile
420 425 430

Asp Val Glu Lys Val Gly Ser Ile Thr Phe Lys Gln Phe Leu Phe Ala
435 440 445

Ser Gly His Val Leu Thr Gln Pro Leu Phe Lys Gln Thr Cys Glu Leu
450 455 460

Ala Phe Ser His Cys Asp Ala Asp Gly Asp Gly Tyr Ile Thr Ile Gln
465 470 475 480

Glu Leu Gly Glu Ala Leu Lys Asn Thr Ile Pro Asn Leu Asn Lys Asp
485 490 495

Glu Ile Arg Gly Met Tyr His Leu Leu Asp Asp Asp Gln Asp Gln Arg
500 505 510

Ile Ser Gln Asn Asp Leu Leu Ser Cys Leu Arg Arg Asn Pro Leu Leu
515 520 525

Ile Ala Ile Phe Ala Pro Asp Leu Ala Pro Thr
530 535

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<210> 20 
<211> 1128 
<212> DNA 

<213> Arabidopsis sp. 



<400> 20 

atggaaaaaa agagtgtacc aaattctgat aagttgtctc tgattagagt gttaagaggt 60
ataatatgtc tgatggtgtt agtttcaaca gcttttatga tgttgatatt ctgggggttc 120
ttatcagctg tagtgttgag gcttttcagc attcgctata gccgtaaatg tgtttccttc 180
ttctttggct cgtggctcgc cttgtggcct ttcctctttg agaagattaa caaaaccaaa 240
gttatcttct ctggtgataa ggttccttgc gaggatcgag tattgctcat tgcaaaccac 300
cgaacagaag ttgattggat gtacttctgg gatcttgcac tgcgtaaagg ccagattggg 360
aatatcaaat atgtgcttaa gagtagtttg atgaaattac ctctctttgg ttgggcgttt 420
cacctctttg agtttattcc tgttgagagg agatgggaag tcgatgaagc aaacttgaga 480
cagatagttt cgagttttaa ggatccccga gacgctttat ggcttgctct tttccccgag 540
ggcacagatt acacagaggc taaatgccaa aggagtaaga aatttgctgc tgaaaatggc 600
cttccgatac tgaacaacgt gctgctfcccc aggacaaaag gtttcgtctc ctgcttgcaa 660
gaactgagtt gctcacttga cgcagtttat gatgtgacca tcggttataa aacccgctgc 720
ccatctttct tagacaacgt ttatggaatt gagccatcag aagttcacat ccacatccgt 780
cgtatcaacc tgacccaaat cccaaatcaa gaaaaggaca tcaatgcttg gttaatgaac 840
acattccagc tcaaagacca gctgctcaat gacttttact ccaatggtca tttccctaac 900
gaaggaacag agaaagagtt caacacaaag aagtacctca taaactgttt ggcagtgatt 960
gccttcacca ccatctgtac acatctcacc ttcttctcat caatgatttg gttcaggatt 1020
tatgtctctt tggcctgtgt ctacttgacc tctgctacgc atttcaatct tcgttctgtt 1080
ccacttgttg agactgcaaa aaattccctc aaattagtaa acaaataa 1128



<210> 21 
<211> 375 
<212> PRT 

<213> Arabidopsis sp. 
<400> 21 

Met Glu Lys Lys Ser Val Pro Asn Ser Asp Lys Leu Ser Leu Ile Arg
1 5 10 15

Val Leu Arg Gly Ile Ile Cys Leu Met Val Leu Val Ser Thr Ala Phe
20 25 30

Met Met Leu Ile Phe Trp Gly Phe Leu Ser Ala Val Val Leu Arg Leu
35 40 45

Phe Ser Ile Arg Tyr Ser Arg Lys Cys Val Ser Phe Phe Phe Gly Ser
50 55 60

Trp Leu Ala Leu Trp Pro Phe Leu Phe Glu Lys Ile Asn Lys Thr Lys
65 70 75 80

Val Ile Phe Ser Gly Asp Lys Val Pro Cys Glu Asp Arg Val Leu Leu
85 90 95

Ile Ala Asn His Arg Thr Glu Val Asp Trp Met Tyr Phe Trp Asp Leu
100 105 110

Ala Leu Arg Lys Gly Gln Ile Gly Asn Ile Lys Tyr Val Leu Lys Ser
115 120 125

Ser Leu Met Lys Leu Pro Leu Phe Gly Trp Ala Phe His Leu Phe Glu
130 135 140

Phe Ile Pro Val Glu Arg Arg Trp Glu Val Asp Glu Ala Asn Leu Arg
145 150 155 160

Gln Ile Val Ser Ser Phe Lys Asp Pro Arg Asp Ala Leu Trp Leu Ala
165 170 175

Leu Phe Pro Glu Gly Thr Asp Tyr Thr Glu Ala Lys Cys Gln Arg Ser
180 185 190

Lys Lys Phe Ala Ala Glu Asn Gly Leu Pro Ile Leu Asn Asn Val Leu
195 200 205

Leu Pro Arg Thr Lys Gly Phe Val Ser Cys Leu Gln Glu Leu Ser Cys
210 215 220

Ser Leu Asp Ala Val Tyr Asp Val Thr Ile Gly Tyr Lys Thr Arg Cys
225 230 235 240

Pro Ser Phe Leu Asp Asn Val Tyr Gly Ile Glu Pro Ser Glu Val His
245 250 255

Ile His Ile Arg Arg Ile Asn Leu Thr Gln Ile Pro Asn Gln Glu Lys
260 265 270

Asp Ile Asn Ala Trp Leu Met Asn Thr Phe Gln Leu Lys Asp Gln Leu
275 280 285

Leu Asn Asp Phe Tyr Ser Asn Gly His Phe Pro Asn Glu Gly Thr Glu
290 295 300

Lys Glu Phe Asn Thr Lys Lys Tyr Leu Ile Asn Cys Leu Ala Val Ile
305 310 315 320

Ala Phe Thr Thr Ile Cys Thr His Leu Thr Phe Phe Ser Ser Met Ile
325 330 335

Trp Phe Arg Ile Tyr Val Ser Leu Ala Cys Val Tyr Leu Thr Ser Ala
340 345 350

Thr His Phe Asn Leu Arg Ser Val Pro Leu Val Glu Thr Ala Lys Asn
355 360 365

Ser Leu Lys Leu Val Asn Lys
370 375



<210> 22 
<211> 1170 
<212> DNA 

<213> Arabidopsis sp. 
<400> 22 

atggtgattg ctgcagctgt catcgtgcct ttgggccttc tcttcttcat atctggtctc 60
gctgtcaatc tctttcaggc agtttgctat gtactcattc gaccactgtc taagaacaca 120
tacagaaaaa ttaaccgggt ggttgcagaa accttgtggt tggagcttgt atggatagtt 180
gactggtggg ctggagttaa gatccaagtg tttgctgata atgagacctt caatcgaatg 240
ggcaaagaac atgctcttgt cgtttgtaat caccgaagtg atattgattg gcttgtggga 300
tggattctgg ctcagcggtc aggttgcctg ggaagcgcat tagctgtaat gaagaagtct 360
tccaaattcc ttccagtcat aggctggtca atgtggttct cggagtatct ctttctggaa 420
agaaattggg ccaaggatga aagcactcta aagtcaggtc ttcagcgctt gagcgacttc 480
cctcgacctt tctggttagc cctttttgtg gagggaactc gctttacaga agccaaactt 540
aaagccgcac aagagtatgc agcctcctct gaattgccta tccctcgaaa tgtgttgatt 600
cctcgcacca aaggtttcgt gtcagctgtt agtaatatgc gttcatttgt cccagcaatt 660
tatgatatga cagtgactat tccaaaaacc tctccaccac ccacgatgct aagactattc 720
aaaggacaac cttcagtggt gcatgttcac atcaagtgtc actcgatgaa agacttacct 780
gaatcagatg acgcaattgc acagtggtgc agagatcagt ttgtggctaa ggatgctctg 840
ttagacaaac acatagctgc agacactttc cccggtcaac aagaacagaa cattggccgt 900
cccataaagt cccttgcggt ggttctatca tgggcatgcg tactaactct tggagcaata 960
aagttcctac actgggcaca actcttttct tcatggaaag gtatcacgat atcggcgctt 1020
ggtctaggta tcatcactct ctgtatgcag atcctgatac gctcgtctca gtcagagcgt 1080
tcgaccccag ccaaagtcgt cccagccaag ccaaaagaca atcaccaccc agaatcatcc 1140
tcccaaacag aaacggagaa ggagaagtaa 1170

<210> 23 
<211> 389 
<212> PRT 

<213> Arabidopsis sp. 
<400> 23 

Met Val Ile Ala Ala Ala Val Ile Val Pro Leu Gly Leu Leu Phe Phe
1 5 10 15

Ile Ser Gly Leu Ala Val Asn Leu Phe Gln Ala Val Cys Tyr Val Leu
20 25 30

Ile Arg Pro Leu Ser Lys Asn Thr Tyr Arg Lys Ile Asn Arg Val Val
35 40 45

Ala Glu Thr Leu Trp Leu Glu Leu Val Trp Ile Val Asp Trp Trp Ala
50 55 60

Gly Val Lys Ile Gln Val Phe Ala Asp Asn Glu Thr Phe Asn Arg Met
65 70 75 80

Gly Lys Glu His Ala Leu Val Val Cys Asn His Arg Ser Asp Ile Asp
85 90 95

Trp Leu Val Gly Trp Ile Leu Ala Gln Arg Ser Gly Cys Leu Gly Ser
100 105 110

Ala Leu Ala Val Met Lys Lys Ser Ser Lys Phe Leu Pro Val Ile Gly
115 120 125

Trp Ser Met Trp Phe Ser Glu Tyr Leu Phe Leu Glu Arg Asn Trp Ala
130 135 140

Lys Asp Glu Ser Thr Leu Lys Ser Gly Leu Gln Arg Leu Ser Asp Phe
145 150 155 160

Pro Arg Pro Phe Trp Leu Ala Leu Phe Val Glu Gly Thr Arg Phe Thr
165 170 175

Glu Ala Lys Leu Lys Ala Ala Gln Glu Tyr Ala Ala Ser Ser Glu Leu
180 185 190

Pro Ile Pro Arg Asn Val Leu Ile Pro Arg Thr Lys Gly Phe Val Ser
195 200 205

Ala Val Ser Asn Met Arg Ser Phe Val Pro Ala Ile Tyr Asp Met Thr
210 215 220

Val Thr Ile Pro Lys Thr Ser Pro Pro Pro Thr Met Leu Arg Leu Phe
225 230 235 240

Lys Gly Gln Pro Ser Val Val His Val His Ile Lys Cys His Ser Met
245 250 255

Lys Asp Leu Pro Glu Ser Asp Asp Ala Ile Ala Gln Trp Cys Arg Asp
260 265 270

Gln Phe Val Ala Lys Asp Ala Leu Leu Asp Lys His Ile Ala Ala Asp
275 280 285

Thr Phe Pro Gly Gln Gln Glu Gln Asn Ile Gly Arg Pro Ile Lys Ser
290 295 300

Leu Ala Val Val Leu Ser Trp Ala Cys Val Leu Thr Leu Gly Ala Ile
305 310 315 320

Lys Phe Leu His Trp Ala Gln Leu Phe Ser Ser Trp Lys Gly Ile Thr
325 330 335

Ile Ser Ala Leu Gly Leu Gly Ile Ile Thr Leu Cys Met Gln Ile Leu
340 345 350

Ile Arg Ser Ser Gln Ser Glu Arg Ser Thr Pro Ala Lys Val Val Pro
355 360 365

Ala Lys Pro Lys Asp Asn His His Pro Glu Ser Ser Ser Gln Thr Glu
370 375 380

Thr Glu Lys Glu Lys
385



<210> 24 

<211> 269 

<212> DNA 

<213> Glycine max 

<400> 24 

gacccactga acgctctcat caccttcacg tggctcccct tcggcttcat cctctccatc 60 

ataagggtct acttcaacct ccctctccca gaacncattg tccgctacac ctacgagatg 120 

ctcggcatca acctcgtcat ccgcggccac cgccctcctc cgccttcccc cggcaccccc 180 

ggcaacctct acgtctgcaa ccaccgcacc gctctcgacc ccatcgtcat cgccattgcc 240 
ctcggccgca aggtctcctg cgtcaccta 269 

<210> 25 

<211> 242 

<212> DNA 

<213> Glycine max 

<400> 25 

tgatcttcca cgacggccgt ttcgtgcaga ggccagaccc actgaacgct ctcatcacct 60 
tcacgtggct ccccttcggc ttcatcctct ccatcataag ggtctacttc aaccttcctc 120 
tcccagaacg cattgtccgc tacacctacg agatgctcgg catcaacctc gtcatccgcg 180 
gccaccgccc tcctccgcct tcccccggca cccccggcaa cctctacgtc tgcaaccacc 240 
gc 242 

<210> 26 

<211> 272 

<212> DNA 

<213> Glycine max 

<400> 26 

gtttgttcaa aggccaactc ctctagcagc cctcttgacc ttcctatggt tgccaattgg 60 

catcatactc tccatnctta agggtctacc ttaacatccc tttgcctgaa agaattgctt 120 

ggtataacta taagctatta ggaatcagag ttattgtgaa gggtacccct ccaccacccc 180 

caaagaaggg tcaaagtggt gtcctatttg tttgtaacca ccgcacagtt ttagaccctg 240 

tggttactgc agttgcactt ggaagaaaaa tt 272 

<210> 27 

<211> 218 

<212> DNA 

<213> Glycine max 

<400> 27 

atagcacagg agggttacat ggtgcctccg agcaaatcag caaaggcagt cccacaggag 60 

cgtctgaaga gcagaatgat cttccacgac gggcgtttcg tgcagaggcc agacccaatg 120 

aatgccctca tcaccttcac atggctccct ttgggtttcg tcctctccat cataagggtc 180 

tacttcaacc tccctctccc agaacgcatc gtccgcta 218 

<210> 28 

<211> 270 

<212> DNA 

<213> Glycine max 

<400> 28 

gtgcctgttg ctgtgaactg caagcagaac atgttctttg gaaccaccgt tcgtggcgtc 60 
aagttctggg acccttaact tacttcttac atgaacccta ggcctgtgta cgaggttacc 120
ttaccttgat acctttgccg aggagatgtc ggttaaggct ggggggaagt cgtccattga 180
ggtggccaac cacgtggcag aaggtgctgg gggatgtgtt agggtttgag tgcaccgggt 240 
tgactaggaa ggataagtat atgttgttgg 270 

<210> 29 

<211> 252 

<212> DNA 

<213> Glycine max 



<400> 29 

catgagggta ggtttgctca aaggccaact cctctagctg ccctcttgac cttcctatgg 60
ctgccaattg gcatcatact ctccatctta agggtctacc ttaacatccc tttgcctgaa 120
agaattgttg gtacaactac aagctcttag gaatcagagt tattgtgaag ggtacccctc 180
caccgccccc aaagaagggt caaagtggtg tctatttgtt tgtaaccacc gcacagtatt 240
agaccctgtt gt 252



<210> 30 

<211> 272 

<212> DNA 

<213> Glycine max 



<400> 30 

ctgggactgc cttaaacgat gcatggatct tatcaagaaa ggagcctctg tttttttctt 60
tccagaggga acacgcagta aagatggaag actaggcaca ttcaagaagg gtgctttcag 120
tgttgctgca aagacaaatg caccagtagt accaattacc cttattggaa ctggtcaaat 180
catgcctgca ggaaaggagg gaatagtgaa cataggttct gtgaaagtgg ttatacataa 240
acctattgtt ggaaaggatc ctgacatgtt at 272



<210> 31 

<211> 239 

<212> DNA 

<213> Glycine max 



<400> 31 

cgggaatcaa ggtcatcaga cttcaagggt gtttcagctg ttgtcactga cagaattcga 60
gaagctcatc agaatgagtc tgctccatta atgatgttat ttccagaagg tacaaccaca 120
aatggagagt tcctccttcc attcaagact ggtggttttt tggcaaaggc accggtactt 180
cctgtgatat tacgatatca ttaccagaga tttagccctg cctgggattc catatctgg 239



<210> 32 

<211> 242 

<212> DNA 

<213> Glycine max 



<400> 32 

gaacggcaac ggcaacagcg ttcgcgatga ccgtcctctg ctgaagccgg agcctccggt 60
cttccgccga cagcatcgcc gatatggaga agaagttcgc cgcttacgtc cgccgctacg 120
tgtacggcac catgggacgc ggcgagttgc ctcccaagga gaagctcttg ctcggtttcg 180
cgttggtcac tcttctcccc attcgagtcg ttctcgccgt caccatattg ctcttttatt 240
ac 242



<210> 33 

<211> 248 

<212> DNA 

<213> Glycine max 



<400> 33 

ttcttcttct ctcactctct aaaaccctaa ctctatacat ggaagggaaa nctcaaatct 60
natgactaat taattaatcc atcgatcaag catggagtcc gaactcaaag acctcaattc 120
gaagccgccg aacggcaacg gcaacagcgt tcgcgatgac cgtcctctgc tgaagccgga 180
gcctccggtc tccgccgaca gcatcgccga tatggagaag aagttcgccg cttacgtccg 240
ccgcgacg 248



<210> 34 

<211> 217 

<212> DNA 

<213> Glycine max 



<400> 34 

aaaaccctaa ttctatacat ggaagggaaa tctcaaatct aatgactaat taattaatcc 60
atcgatcaag catggagtcc gaactcaaag acctcaattc gaagccgccg aacggcaacg 120
gcaacagcgt tcgcgatgac cgtcctctgc tgaagccgga gcctccggtc tccgccgaca 180
gcatcgccga tatggagaag aagttcgccg cttacgt 217

<210> 35

<211> 257

<212> DNA

<213> Glycine max

<400> 35

atctctgtct ctgcatttcc ctccctaaaa ccctaattct acatttggaa aggaaatctc 60
aaatctaatg actaattaat caatcaatcg tattaataat ccatcgatca agtatggagt 120
ccgaactcaa agacctcaat tcgaagccac ccaactgcaa cggcaacgcc aacagcgttt 180
gcgacgaccg tcctctgctg aagccggagc ctccggcctc ctccgacagc atcgccgaga 240
tggagaagaa gttcgcc 257



<210> 36 

<211> 284 

<212> DNA 

<213> Glycine max 



<400> 36 

cccgaccaaa acaggttttt gtggccaatc atacttccat gattgatttc attatcttag 60
aacagatgac tgcatttgct gttattatgc agaagcatcc tggatgggtt ggattattgc 120
agagcaccat tntggagagt gtagggtgta tctggttcaa ccgtacagag gcaaaggatc 180
gagaagttgt ggcaaggaaa ttgagggatc atgtcctggg agctaacaac aaccctcttc 240
ttatatttcc tgaaggaact tgtgtaaata atcactactc gtca 284



<210> 37 

<211> 246 

<212> DNA 

<213> Glycine max 

<400> 37 

ggagatccgc ataagcaaat caatcatcct gttccttcct tatctctgtc tctgcatttc 60
cctccctaaa accctaattc tacatttgga aaggaantct caaatctaat gataattaat 120
caatcaatcg tattaataat ccatcgatca agtatggagt ccgaactcaa agacctcaat 180
tcgaagccac ccaactgcaa cggcaacgcc aacagcgttt gcgacgaccg tcctctgctg 240
aagccg 246

<210> 38

<211> 278

<212> DNA

<213> Glycine max

<400> 38

gttttctatt gccacgttgt ggaagcgtaa cgaagatgaa tggcattggg aaactcaaat 60
cgtcgagttc tgaattggac cttcacattg aagattacct accttctgga tccagtgttc 120
aacaagaacg gcatggcaag ctccgactgt gtgatttgct agacatttct cctagtctat 180
ctgaggcagc acgtgccatt gtagatgata cattcacaag gtgcttcaag caaatcctcc 240
agaaccttgg aactggaatg tttatttgtt tcctttgt 278

<210> 39

<211> 312

<212> DNA

<213> Glycine max



<400> 39 

ttaactttgg cacattctcc ttttgttcat caatgtgtgt tgtaaattgt ncatttcctt 60 
cagaggtctt tggtaganat gatgtgcagt ttctgtggtg catcttggac tgnggntgtt 120 
aagnatcatg gacccaggcc tagcaggaga ccaaagcagg tttttgtagc caaccatact 180 
tcatgattga tntcattatn tnagaacaga tgactgcttt tgcngttatn atgcagaagc 240 
atcctggatg ggttggtaag cntacagnat gtcaacngtg tatnaaatat gntacacnnn 300 
acttgcgtct tc 312 

<210> 40 

<211> 255 

<212> DNA 

<213> Glycine max 
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<400> 40 

ggattattgn ngcanatgca gtcatctgtt ctaagataat ganatcnatc atggaagtat 60
gattggncac anaaacctgt yttttggttg gatactaggt cttggcccat ggtacttgac 120
naccccagtc catgatgcaa canaganact gnacatcatc tccaccaaac ccctctgana 180
ganacgagaa ttgagcaatt tagagtacct tggtttgatg caagtcagta tattcaagtt 240
tctattcatc aaagg 255

<210> 41

<211> 291

<212> DNA

<213> Glycine max

<400> 41

caacctccca tgcaatcgct caccctctcc gtcacctgaa tctgttttct attccctccg 60
tcgcgtaaca aggatgaatg gcattgggaa actcaaatcg tcgagttctg aattggacct 120
tcacattgaa gattacctgc cttctggatc cagtgttcaa caagaacggc atggcaagct 180
ccgcctgtgt gatttgctag acatttctcc tagtctatct gaggcagcac gtgccattgt 240
agatgataca ttcacaaggt gcttcaagtc aaatcctcca gaaccttgga a 291

<210> 42

<211> 284

<212> DNA

<213> Glycine max

<400> 42

ctgcaaccta ccatgcaatt cctcacctga atccgttttc tattgccacg ttgtggaagc 60
gtaacgaaga tgaatggcat tgggaaactc aaatcgtcga gttctgaatt ggaccttcac 120
attgaagatt acctaccttc tggatccagt gttcaacaag aacggcatgg caagctccga 180
ctgtgtgatt tgctagacat ttctcctagt ctatctgagg cagcacgtgc catgtagatg 240
atacatcaca aggtgctcaa gtcaaatctc cagaaccttg gaat 284

<210> 43

<211> 268

<212> DNA

<213> Glycine max



<400> 43 

ctgaagtatt ctcgtcctag cccaaagcat agagaaaggn agcaacagaa ctttgctgag 60
tcagtgctgc ggcgatggga ggaaaagtga tgtgtacctt tatgtggtgt tgttcttaat 120
tattcttagt aatgccattg cttcgacccc tttttttgct tttgttttgt cattgctaac 180
tatttatttt taacactttt attaaagata tggcatatat ncacttcagt anacaaagtt 240
gtnccagtaa tttnttttcc aaaaaaaa 268



<210> 44 

<211> 241 

<212> DNA 

<213> Glycine max 



<400> 44 

gancaaaatt gccctccatc actttccttg ttagagttgg tttctgcnac ctaccatgca 60
attccctcac ctgaatccgt tttctattgc cacgttgtgg aagcgtaacg aagatgaatg 120
gcattgggaa actcaaatcg tcgagttctg aattggacct tcacattgaa gattacctac 180
cttctggatc cagtgttcaa caagaacggc atggcaagct ccgactgtgt gatttgctag 240
a 241

<210> 45

<211> 247

<212> DNA

<213> Glycine max

<400> 45

gtaggatgtc tgagatcctt gccccaatca aaacggtgcg gttaactaga aaccgcgacg 60
aggatgcgaa aatgatgaaa aatttgctgg ggcaagggga cctggtggtt tgtcctgaag 120
ggaccacatg tagagaacct tatttattga ggttcagccc tctgttctca gagatgtgcg 180
atgagattgt ccccgttggc agttgattcc cagttatatg ttccacggaa ccactgctgg 240
tgganta 247

<210> 46

<211> 271

<212> DNA

<213> Glycine max

<400> 46

tgcagggggg cttgttagag ccatagtttt ggttcttcta tacccttttg tttgtgtcgt 60
aggaaaagag atggggttga agataatggt catggcatgc ttcttcggga tcaaagcatc 120
gagcttcaga gttggaaggt ccgttttgcc cnaattcttc tnggaggacg ttngtgcaga 180
aatgtttgag gcactcaaaa aaggagggaa gacagtggga gttaccaatt taccccacgt 240
gatggtggaa agcttcttga gagagtattt a 271

<210> 47

<211> 242

<212> DNA

<213> Glycine max



<400> 47 

ttcacagctg tcacgccgtn aacggaaaat ggcaacggcg agacgcagtt tcccgcctat 60
caccgaatgc aacggaacga cnccgtgcga ntctgtngnc gccgacctcg agggtacgct 120
cctcatctcc cgtngctcgt tcccgtactt catgctcgtc gccgtcgaag ccggcagcnt 180
cctccgcggc ctcatgctnc tcctctccct tccgttcgtc atnatcgcct acctcttcat 240
ct 242



<210> 48 

<211> 244 

<212> DNA 

<213> Glycine max 

<400> 48 

acatattctt cagttagctc ccccaaccta tacacttcac caccacacca caaccctacc 60 
ctctctctct gtcatggtca ttggaggagc cttccctcgt ttcgacccaa tcaccaaatg 120 
tagacccaag accgctccaa ccagaccatc gcctcggacc tcgatggcac cctccttgtc 180 
tcccggagtg ccttccccta ctacttcctc gtcgccctcg aagccggcag cgtcttccga 240 
gcct 244

<210> 49
<211> 230
<212> DNA
<213> Glycine max

<400> 49

caacattcca cctagctccc caatcacatc ttcaccacac cataaacctt cttaatttct 60
ctcttcattt tctcctctat tgtcataatc atggggacct tccctcgctt cgacccaatc 120
accacccaag accggtccaa ccagaccgtg gcctccgacc ttgacggcac cctcctcgtc 180
tcccggagcg ccttccccta ctacctcctc gttgccctcg aagccggcag 230



<210> 50 

<211> 265 

<212> DNA 

<213> Glycine max 



<400> 50 

ctggtgaata atcctaagtt atggagtctg tggtgtgtga gctagaaggc acgcttgtga 60 

aggacaagga tgcgttctca tacttcatgt tggttgcgtt tgaagcttca ggtttggttc 120 

gtttcgcctt gttgctaaca ctattgcccg tgattcggtt ccttgacatg gttggcatga 180 

acgatgcatc tctcaagcta ntnatcttcg tggctgtggc tggtgttcca aagtccgaga 240 
ttgaatcagt ggctagggca gtttt 265 

<210> 51 

<211> 252 

<212> DNA 

<213> Glycine max 

<400> 51 

ctggtgaata atcctaagtt atggagtctg tggtgtgtga gctagaaggc acgcttgtga 60 
aggacaagga tgcgttctca tacttcatgt tggttgcgtt tgaagcttca ggtttggttc 120 
gtttcgcctt gttgctaaca ctattgcccg tgattcggtt ccttgacatg gttggcatga 180 
acgatgcatc tctcaagcta atgatcttcg tggctgtggc tgggttccaa agtccgagat 240 
tgaatcagtg gc 252 



<210> 52 
<211> 218 






<212> DNA 

<213> Glycine max 

<400> 52 

aactgcaact acaacaacat tcattcattc acagctgtca cgccgtgaac ggaaaatggc 60
aacggcgaga cgcagtttac ccgcctatac accgaatgca acggaacgac accgtgcgag 120
tctgtggccg ccgacctcga cggtacgctc ctcatntccc gtagctcgtt cccgtacttc 180
atgctcgtcg ccgtcgaagc cggcagcctc ctccgcgg 218



<210> 53 

<211> 262 

<212> DNA 

<213> Glycine max 

<400> 53 

ggttaaggac attgagatgg tcgnntcctc ggtgctgccc aagttctaca ccgaggacgt 60 
gcnccccgag agctggagag tcttcaatcc ttcgggaagc gttacattgt cactgctagt 120 
ctagggtgat ggtggagcan tttgttaaga cgtttcttgg ggctgataag gtgcttggga 180 
ctgagcttga ggccacgaaa tcggggaggt tcatgggttt gttaaggagc ctggtgtgct 240 
tgttggggag cacaagaaag tg 262 

<210> 54 

<211> 212 

<212> DNA 

<213> Glycine max 

<400> 54 

gcaactacaa caacattcat tcattcacag ctgtcacgcc gtgaacggaa aatggcaacg 60 
gcgagacgca gtttcccgcc tatcaccgaa tgcaacggaa cgacgccgtg cgagtctgtg 120 
gccgccgacc tcgacggtac gctcctcatc tcccgtagnc cgttcccgta cttcatgctc 180 
gtngccgtcg aagccggcag cctcctccgc gg 212 



<210> 55 

<211> 273 

<212> DNA 

<213> Glycine max 



<400> 55 

catggttttc ttgagcttct ttggcctcag aaaggacaca ttcagaacag gatcagctgt 60
tctggcaaag ttcttcttag aagatgttgg attggaaggc tttgaggccg taatatgttg 120
tgagagaaaa gtggcatcta gtaagttgcc aagggtcatg gttgaaaatt tcctcaagga 180
ctatttaggg gttgatgctg ttatagcaag agaattgaag tcctttagtg gcttcttttt 240
gggagttttt gagagtaaga agccaattaa aat 273

<210> 56

<211> 257

<212> DNA

<213> Glycine max



<400> 56 

ctctcaaaaa aggagggaag acagtgggag tcaccaatct accccatgtg atggtggaaa 60
gcttcttgag agagtatttg gacattgatt tcgttgtggg cagggagctg aaagttttct 120
gtggatacta cgtaggattg atggatgaca caaaaactat gcatgccttg gagctggtta 180
aagaaggaaa aggatgctcc gacatgatcg gaatcacaag gtttcgcaac atacgcgacc 240
atgatgattt tttctcc 257



<210> 57 

<211> 240 

<212> DNA 

<213> Glycine max 



<400> 57 

gaactaagtg tgaaccacta ccaagaaaca agcttttaag tccaattatt tttcatgagg 60
gtaggtttgc tcaaaggcca actcctctag ctgnnctctt gaccttccta tggctgccaa 120
ttggcatcat actctccatc ttaagggtct accttaacat ccctttgcct gaaagaattg 180
cttggtacaa ctacaagctc ttaggaatca gagttattgt gaagggtacc cctccaccgc 240



<210> 58 
<211> 254 
<212> DNA 
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<213> Glycine max 



<400> 58 

cttggaataa gggtcattag gaagggtatc cctccacccc cagcnaagaa gggccaaagt 60
ggagtcctat ttgtatgcaa ccacaggaca gttttagacc ctgtggttac agctgttgca 120
ttaggaagga aaattagctg tgtcacatat agcataagca aattcactga aataatttca 180
ccaatcaaag ctgtggcact ctctagggag agggacaaag atgctgccaa catcaagang 240
ttgcttgagg aagg 254



<210> 59 

<211> 267 

<212> DNA 

<213> Glycine max 



<400> 59 

gccaganaga cttgcttggt acaactacaa gcttcttgga ataagggtca ttaggaaggg 60
tatccctcca cccccagcaa agaagggcca aagtggagtc ctatttgtat gcaaccacag 120
gacagtttta gaccctgtgg ttacagctgt tgcattagga aggaaaatta gctgtgtcac 180
atatagcata agcaaattca ctgaaataat tcaccaatca aagctgtggc actctctagg 240
gagagggacc nagatgctgc cnacatc 267



<210> 60 

<211> 261 

<212> DNA 

<213> Glycine max 



<400> 60 

gtaaccacag ggtctaaaac tgtgcggtgg ttactgcagt tgcacttgnc nagaaaaatt 60
tgcttatgct atatgtgaca cagctaattc actgnaataa tttcaccaat taaagctgtg 120
gcactctcaa ggganngaga gaaagatgct gccaatatcc ngagactact tgaggaaggg 180
gacttggtga tttgccctga aggcacaact tgtagagagc cttcctcttg aggttcagtg 240
cactatttgc tgaactcact g 261



<210> 61 

<211> 258 

<212> DNA 

<213> Glycine max 

<400> 61 

caaggagctc acatgcagtg gagggaaatc agctattgaa gttgcaaact acattcaaag 60 
ggttcttgca gggactttgg gatttgagtg cacaaatttg actaggaaga gcaaatatgc 120 
catgcttgca ggcacagatg ggacagttcc atctaaggag aaggcttgan aagggagaga 180 
aattaagttc tcccttttga ttattctgta ttggtgccca atgtgtttcc aaaacactta 240 
gaattatgat agaaataa 258 

<210> 62 

<211> 258 

<212> DNA 

<213> Glycine max 



<400> 62 

attggcataa tcctctccat cctaagggtc tatctcaaca tccctctgcc agaaagactt 60
gcttgntaca actacaagct tcttggaata agggtcatta ggaagggtat ccctccaccc 120
ccagcaaaga agggccaaag tggagcctat ttgtatgcaa ccacaggaca gttttagacc 180
ctgtggttac agctgttgca ttaggaagga aaattagctg tgtcacatat agcataagca 240
aattcactga aataattt 258

<210> 63

<211> 239

<212> DNA

<213> Glycine max



<400> 63 

cacttcacca ccacaccaca accctaccct ctctctctgt catggtcatt ggaggagcct 60
tccctcgttt cgacccaatc accaaatgta gcacccaaga ccgctccaac cagaccatcg 120
cctcggacct cgatggcacc ctccttgtct cccggagtgc cttcccctac tacttcctcg 180
tcgccctcga agccggcagc gtcttccgag ccctccttct cttaaccttc gtccccttc 239



<210> 64 
<211> 531 
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<212> DNA 

<213> Glycine max 

<400> 64 

ccgagaaccg gtctaaccaa accgtggcct cggacttgga cggcaccctc ctggtgtccc 60
ccagcgcatt tccttactac atgctggtcg ccatcgaagc cggcagcttc ctccgtggcc 120
ttgtcctcct tgcctccgtc cctttcgtgt attcacgtac atattcctct ccgagaccgc 180
ggccatcaag tccctgatct tcatcgcctt cgcgggcctg aaggtcaggg acgttgagat 240
ggtcgcgtgc tcggtgctgc ccaagttcta cgccgacata ttcttcagtt agctccccca 300
acctatacac ttcaccacca caccacaacc ctaccctctc tctctgtcat ggtcattgga 360
ggagccttcc ctcgtttcga cccaatcacc aaatgtagca cccaagaccg ctccaaccag 420
accatcgcct cggacctcga tggcaccctc cttgtctccc ggagtgcctt cccctactac 480
ttcctcgtcg ccctcgaagc cggcagcgtc ttccgagccc tccttctctt a 531



<210> 65 

<211> 256 

<212> DNA 

<213> Glycine max 



<400> 65 

acatattctt cagttagctc ccccaaccta tacacttcac caccacacca caaccctacc 60
ctctctctct gtcatggtca ttggaggagc cttccctcgt ttcgacccaa tcaccaaatg 120
tagcacccaa gaccgctcca accagaccat cgcctcggac ctcgatggca ccctccttgt 180
ctcccggagt gccttcccct actacttcct cgtcgccctc gaagccggca gcgtcttccg 240
agccctcctt ctctta 256

<210> 66

<211> 260

<212> DNA

<213> Glycine max

<400> 66

ccatccaaca tattcttcag ttagctcccc caacctatac acttcaccac cacaccacaa 60
ccctaccctc tctctctgtc atggtcattg gaggagcctt ccctcgtttc gacccaatca 120
ccaaatgtag cacccaagac cgctccaacc agactatcgc ctcggacctc gatggcaccc 180
tccttgtctc ccggagtgcc ttcccctact acttcctcgt cgccctcgaa gccggcagcg 240
tcttccgagc cctccttctc 260

<210> 67

<211> 248

<212> DNA

<213> Glycine max



<400> 67 

caccaaccaa acctcactct ccctttctcc cctgaccctc tccctgccat ggtcatggga 60 
gcctttggcc acttcgaacc ggtctccaaa tgcagcaccg agaaccggtc taaccaaacc 120 
gtggcctcgg acttggacgg caccctcctg gtgtccccca gcgcatttcc ttactacatg 180 
ctgggcgcca tcgaagccgg cagcttcctc cgtggccttg tcctccttgc ctccgtccct 240 
ttcgtgta 248 

<210> 68 

<211> 283 

<212> DNA 

<213> Glycine max 



<400> 68 

ttcttcccca ccatcacacc aancaaacct cactctncct ggccatggtc atgnnngcct 60
ttccgccact tcgaaccggt ttccaaatgc agcaccgaaa accggtttaa ccaaaccgtg 120
gcctcggact tggacggcac cctcctggtg tcccctagcg cctttcctta ctacatgctc 180
gtcgccatcg aagccggcag cttcctccgt ggccttgtcc tccttggatc cgtccctttc 240
gtgtacttca cgtacatatt cttctccgag accgcggcca tca 283

<210> 69

<211> 258

<212> DNA

<213> Glycine max

<400> 69

ctcttcttcc ccaccatcnn accaaccaaa cctcactctc cctgaccatg gtcatgggag 60
cctttcgcca cttcgaaccg gtttccaaat gcagcaccga aaaccggttt aaccaaaccg 120
tggcctcgga cttggacggc accctcctgg tgtcccctag cgcctttcct tactacatgc 180
tcgtcgccat cgaagccggc agcttcctcc gtggccttgt cctccttgga tccgtccctt 240
tcgtgtactt cacgtaca 258

<210> 70 

<211> 256 

<212> DNA 

<213> Glycine max 



<400> 70 

tgcaactaca acaacattca ttcattcaca gctgtcacgc cgtgaacgga aaatggcaac 60
ggcgagacgc agtttcccgc ctatcaccga atgcaacgga acgacaccgt gcgagtctgt 120
ggccgccgac ctcgacggta cgctcctcat ctcccgtagc tcgttcccgt acttcatgct 180
cgtcgccgtc gaagccggca gcntcctccg cggcctcatc ctcctcctng ccantccgtt 240
cgtcatcanc gcctac 256

<210> 71
<211> 259
<212> DNA
<213> Glycine max

<400> 71

cttccccacc atcacaccan ggcnaacctc antctccctt tctccacnga ccctctccct 60
gccatngtca tgggancctt tggccacttc gaaccggtct ccaaatgcag caccgagaac 120
cggnctaacc aaaccgtggc ctcggacttg gacggcaccc tcctggtgtc ccncagcgca 180
tttccttact acatgctggc ngccatcgaa gccggcagct tcctccgtgg ccttgtcctc 240
cttgcctccg tccctttcg 259



<210> 72 

<211> 249 

<212> DNA 

<213> Glycine max 



<400> 72 

ccaacatatt cttcagttag ctcccccaac ctatacactt caccaccaca ccacaaccct 60
accctctctc tctgtcatgg tcattggagg agccttccct cgtttcgacc caatcaccaa 120
atgtagcacc caagaccgct ccaaccagac catcgcctcg gacctcgatg gcaccctnct 180
tgtctcccgg agtgccttcc cctactactt cctcgtcgcc ctcgaagccg gcagcgtctt 240
ncgagccct 249



<210> 73 

<211> 257 

<212> DNA 

<213> Glycine max 



<400> 73 

caaccctctt cttccccacc atcacaccaa ncaaacctca ctctcccttt ctcccctgac 60
cctctccctg ccatggtcat gggagccttt ggccacttcg aaccggtctc caaatgcagc 120
accgagaacc ggtctaacca aaccgtggcc tcggacttgg acggcaccct cctggtgtcc 180
cccagcgcat ntccttacta catgctggtc gccatcgaag ccggcagctt cctccgtggc 240
cttgtcctcc ttgcctg 257



<210> 74 

<211> 255 

<212> DNA 

<213> Glycine max 



<400> 74 

gccgaagacg tgcacccgga gagttggaga gtgttcaact ctttcgggaa gcgttacatt 60
gtcacggcta gtcctagggt gatggtggag ccgtttgtta aggcgtttct cggggctgac 120
aaggtgcttg ggactgaact tgaggccacc aaatcgggga cgttcactgg gtttgttaag 180
aagcctggtg tgcttgttgg ggagcataag aaagtggctc tggtgaagga gtttcagggt 240
aattacctga cttgg 255



<210> 75 

<211> 244 

<212> DNA 

<213> Glycine max 



<400> 75 
caacaacatt cattcattca cagctgtcac gccgtgaacg gaaaatggca acggcgagac 60
gcagtttccc gcctatcacc gaatgcaacg gaacgacacc gtgcgagtct gtggccgccg 120
acctcgacgg tacgctcctc atcncccgta gctcgttccc gtacttcatg ctcgtcgccg 180
tcgaagccgg cagcctcctc cgcggcctca tgcnttcctg ggtttanttt gagnacccct 240
gagg 244

<210> 76
<211> 240
<212> DNA
<213> Glycine max

<400> 76

gctggctacc ctcttcttcc ccaccatcac accaatcaaa cctcactcta ccctggccat 60
ggtcatggga gcctttncgc cacttcgaac cggtttccaa atgcagcacc gaanaccggt 120
ttnaccanac cgtggcctcg gncttggacg gcaccctcct ggtgtcccct agcgcctttc 180
cttactacat gctcgtcgcc atcgaagccg gcagcttcct ccgtggcttg tcctccttgg 240



<210> 77 

<211> 263 

<212> DNA 

<213> Glycine max 



<400> 77 

gtttctcggg gctgacaagg tgcttgggac tgaacttgag gccaccaaat cggggacgtt 60 
cactgggttt gttaagaagc ctggtgtgct tgttggggag cataagaaag tggctctggt 120 
gaaggagttt cagggtaatt tacctgactt gggtctaggt gatagtaaaa gtgattatga 180 
cttcatgtca atttgcaagg aagggtacat ggtgccaaga actaagtgtg aaccactacc 240 
aagaaacaag cttttaagtc caa 263 

<210> 78 

<211> 258 

<212> DNA 

<213> Glycine max 

<400> 78 

ggccacgaaa tcggggaggt tcactgggtt tgttaaggag cctggtgtgc ttgttgggga 60 

gcacaagaaa gtggctgttg tgaaggagtt tcagggtaat ttacctgact tgggactagg 120 

agatagtaaa agtgattatg acttcatgtc aatttgcaag gaagggtaca tggtgccaag 180 

gactaagtgt gaaccactac caagaaacaa acttttaagt ccaattattt ntcatgaggg 240 

taggtttgtt caaaggcc 258 

<210> 79
<211> 260 
<212> DNA 
<213> Glycine max 

<400> 79 

ctcttcttcc ccaccatcac accaancaaa cctcactctc cctttctccc ctgaccctct 60 

ccctgccatg gtcatgggag cctttggcca cttcgaaccg gtctccaaat gcagcaccga 120 

gaaccggtct aaccaaaccg tggcctcgga cttggacggc accctcctgg tgtcccccag 180 

cgcatttcct tactacatgc tggtcgccat cgaagccggc agcttcctcc gtgggccttg 240 

tcctccttgc ctccgtccct 260 

<210> 80 

<211> 257 

<212> DNA 

<213> Glycine max 

<400> 80 

gggaacaaca acaaatggca ngaaccttat ctccttccaa cttggtgcat ttatccctgg 60 

atacccaatc cagcctgtaa ttgtacgcta tcctcatgtg cactttgacc aatcctgggg 120 

tcatgtntct ttgggaaagc ttatgttcag aatgttcact caatttcaca acttttttga 180 

ggtagaatat cttcctgtca tttatcccct ggatgataag gaaactgctg tancttntcg 240 
ggagaggact agccggg 257 



<210> 81 

<211> 272 

<212> DNA 

<213> Glycine max 
<400> 81

catacctttt gttggcacca ttattagagc aatgcaggtc atatatgtta acagattctt 60
accatcatca aggaagcagg ctgttaggga aataaaggaa ctgaataaca gagaagggcc 120
tcttgtgata aatttcctcg agtactatta tttcccgagg gaacaacaac taatggcagg 180
aaccttatct ccttccaact tggtgcattt atccctggat acccaatcca gcctgtaatt 240
atacgctatc ctcatgtaca ctttgaccaa tc 272



<210> 82 

<211> 245 

<212> DNA 

<213> Glycine max 



<400> 82 

gggcatttca catactagag ttcatcccag tgaaaagaaa gtgggaggct gatgaatcaa 60 
tcatgcgcca tatgctttct acattcaagg atccacaaga tcctctctgg cttgcgcttt 120 
tcccagaagg cactgatttc actgagcaaa agtgccttcg gagtcaaaaa tatgctgctg 180 
aacataagtt accggttctg aaaaatgttt tacttccaag gacaaagggg cttctgtgcc 240 
gcttg 245 

<210> 83 
<211> 268 
<212> DNA 

<213> Glycine max
<400> 83 

cagtgtcctt cctttctgga caatgttttt ggtgttgacc cttcagaagt gcacctgcat 60 
gtgcggcgta ttccggtgga ggagattcca gcttctgaaa ccaaagctgc ttcttggtta 120 
atcgacacat tccagatcaa ggaccaattg ctttcggatt tcaagattca aggccatttc 180 
cctaaccaac taaatgaaaa tgaaatttct agatttaaga gcctactctc ttttatggtg 240 
atagtttctt ttactgccat gtttattt 268 



<210> 84 

<211> 265 

<212> DNA 

<213> Glycine max 



<400> 84 

gaaagagact gggcaaaaga tgaaacatca ctgaagtcag gttttaggca tctagagcac 60
atgccattcc ctttctggtt ggcccttttt gttgaaggaa ctcgtttcac gcagacaaag 120
cttttacaag ctcaagagtt tgctgcttca aaagggctgc ctatacctag aaatgttttg 180
attcctcgta ctaagggttt tgtcacagca gnacaaagcc ttcggccatt tcgttccagc 240
catttatgat tgcacatatg cagtt 265



<210> 85 

<211> 265 

<212> DNA 

<213> Glycine max 



<400> 85 

gaaagagact gggcaaaaga tgaaacatca ctgaagtcag gttttaggca tctagagcac 60
atgccattcc ctttctggtt ggcccttttt gttgaaggaa ctcgtttcac gcagacaaag 120
cttttacaag ctcaagagtt tgctgcttca aaagggctgc ctatacctag aaatgttttg 180
attcctcgta ctaagggttt tgtcacagca gnacaaagcc ttcggccatt tcgttccagc 240
catttatgat tgcacatatg cagtt 265

<210> 86
<211> 301
<212> DNA
<213> Zea mays

<400> 86

ctcgtcgtca agggcacccc gccgccgccg cccaagaagg gccacccggg cgtcctcttc 60
gtctgcaacc accgcaccgt gctcgacccc gtcgaggtgg ccgtggcgct gcgccgcaag 120
gtcagctgcg tcacctacag catctccaag ttctccgagc tcatctcgcc catcaaggcc 180
gtcgcgctgt cgcgggaggc gacaaggacg ccgagaacat ccgccgcctg ctggaggagg 240
gcgacctggt catctgcccc gagggnaaca actgccgcga gcccttcctg ctgcgttcag 300
g 301



<210> 87 
<211> 309 
<212> DNA
<213> Zea mays 

<400> 87 

cgctcatgcg gtgtacatca acctgccgct gcccgagcgc atcgtctact acacctacaa 60 
gctcatgggc atcaggctcg tcgtcaaggg caccccgccg ccgccgccca agaagggcca 120 
cccgggcgtc ctcttcgtct gcaaccaccg caccgtgctc gaccccgtcg aggtggccgt 180 
ggcgctgcgc cgcaaggtca gctgcgtcac ctacagcatc tccaagttct ccgagctcat 240 
ctcgcccatc aaggccgtcg cgctgtcggg gaggcgacaa ggacgccgag aacatccgcc 300 
gcctgctgg 309 

<210> 88 
<211> 304 
<212> DNA 
<213> Zea mays 

<400> 88 

tggctgtgca ggaggcctac ctggtgacgt caaggaagta cagcccggtg cccaggaacc 60 

agctgctgag cccgctgatt cgtgcacgac ggccgcctcg tgcagcgccc gacgccgctc 120 

gtcgcgctcg tcaccttcct ctggatgccg ttcggcttcg cgctggcgct catgcgcgtg 180 

tacatcaacc tgccgctgcc cgagcgcatc gtctactaca cctacaagct catgggcatc 240 

aggctcgtcg tcaagggcac cccgccgccg ccgcccaaga agggccaccc gggcgtcctc 300 
ttcg 304 



<210> 89 
<211> 312 
<212> DNA 
<213> Zea mays 



<400> 89 

ggttcatcca cttgtgttgc tattngaccg gtaccgtagg agagcacagc actancatcg 60
caaagatttn gggctacggt gacaatctcc atgttctaca atcttnaggt cgaaggaatg 120
gagaatctgc ctccaaatag ctgtcctggt gtctatgttg ctaaccatca gagcttcttg 180
gatatttata cccttctaac tctagggagg tgcttcaaat ttataagcaa gaccagcatc 240
tttatgttcc ctattatagg gtgggcaatg tatctcttgg gtgtgattcc tctgcggcgt 300
atggacagca gg 312



<210> 90 
<211> 264 
<212> DNA 
<213> Zea mays 



<400> 90 

ggtgctgtat ctgaaagaat ccatcgtgct catcaacaga aaaatgcacc aatgatgcta 60
ctcttcccct gagggcacaa ctacaaatgg ggattatctc cttccattca aaacaggtgc 120
ttttcttgca aaggcaccag ttcaaccagt cattttgaga tatccttaca aaagatttaa 180
tgcagcatgg gattccatgt caggggcacg tcatgtattt ctgctgctct gtcaatttgt 240
aaattaccta gaggtggtcc gctt 264



<210> 91 
<211> 212 
<212> DNA 
<213> Zea mays 



<400> 91 

aaatgtcttg gatgcatttt tgttcagcgg gagtcgaaaa caccagattt caaaggtgtt 60
tcaggtgctg tatttgaaag aatccatcgt gctcatcaac agaaaaatgc accaatgatg 120
ctactcttcc ctgagggcac aactacaaat ggggattatc tccttccatt caaaacaggt 180
gcttttcttg caaaggcacc agttcaacca gt 212



<210> 92 
<211> 267 
<212> DNA 
<213> Zea mays 



<400> 92 

gtctaaagaa atngaaaggc gtggggnaat tgtgtctaat catgtntctt atgtggatat 60
tctttatcan atgtcagcct cttttcctag ttttgttgct aagagatcag tggntagatt 120
gcctctagtt ggtctcataa gcaaatgtct tggatgcatt tttgttcagc gggagtnnaa 180
aatncanatt tcaaaggtgt ttaaggtgtg gnatctgaaa gaatccatcg tgctcatcaa 240
cagaaaaatg caccaatgat gctactc 267

<210> 93 
<211> 152 
<212> DNA 
<213> Zea mays 



<400> 93 

ctacaaatgg ggattacctt cttccattta agactggagc ctttnttgca ggtgcaccag 60
tgcagccagt cattttgaaa tacccttaca ggagatttag tccagcatgg gattcaatgg 120
atggagcacg tcatgtgtta ttgctgctct gt 152



<210> 94 
<211> 274 
<212> DNA 
<213> Zea mays 



<400> 94 

aaaatataaa ttaatatggt cttaatccca ccatataaat aacgttctct ttctgcaggg 60
caatttagtt ctttctaata ttgggctggc agagaagcgc gtgtaccatg cagcactgac 120
tggtagtagt ctacctggcg ctagacatga gaaagatgat tgaaagacgt tgcgtcgctt 180
tttctgtaac agacagccga ggaacactta aaaatgtaac tgtgtgcgtg tttttatacc 240
tgtaatgtgg cagtttattt gtttgaggag gctg 274



<210> 95 
<211> 295 
<212> DNA 
<213> Zea mays 



<400> 95 

aatagctatc aagtacaata aaatatttgt tgatgccttt tggaacagta agaagcaatc 60
ttttacaatg cacttggtcc ggctgatgac atcatgggct gttgtgtgtg atgtttggta 120
cttacctcct caatatctga gggagggaga gacggcaatt gcatttgctg agagagtaag 180
ggacatgata gctgctagag ctggactaaa gaaggttcct tgggatggct atctgaaaca 240
caaccgtcct agtcccaaac acactgaaga gaacaacgca tattgccgat ctgtc 295



<210> 96 
<211> 273 
<212> DNA 
<213> Zea mays 



<400> 96 

gngccatctc accggcggcn ggcctgcggc cggcaaccgg aggcgatggc gagctngtct 60 
gtggtggcgg acatggagca ntaccgcccc aacctggagg actacctccc gcccgactcg 120 
ctcccgcagg aggcgcccag gaatctccat ctgcgcgatc tgcttgacat ctcgccggtg 180 
ctaaccgagg cagcgggtgc catagtcgat gattcattca cccgttgctt taagtcgaat 240 
tctccagaac catggaatgg aacatatatt tgt 273 

<210> 97 
<211> 127 
<212> DNA 
<213> Zea mays 



<400> 97 

ctcaatatct ganggaggga gagactgcaa ttgcgtttgc tgagagagta agggacatga 60 
tagcagctag agctggtctt aagaaggtcc cgtgggatgg ctatctgaag cacaaccgcc 120 
ctagtcc 127 



<210> 98 
<211> 286 
<212> DNA 
<213> Zea mays 



<400> 98 

gaaccgtacg cgcctcatta cgcccatcca cgtgctcgcc tctccccatc gcataatttt 60
nctcggcggc gtcgccatct ccancggcng cnggcctgcn gccggcaacc ggaggcgatg 120
gcgagctcgt ctgtggcggc ggacatggag ctggaccgcc ccaacctgga ggactacntc 180
ccgcccgant cgctcccgca ggaggcgacc aggaatctcc atctgngcga tctgcttgan 240
atctcgccgg tgctaaccga ggcagcgggt gccatagtcg atgatt 286
<210> 99
<211> 308 
<212> DNA 
<213> Zea mays 



<400> 99 

cgccatctca tcggcggcgg gcgtgcggcc ggcggcngag gcgaggngcg attggcgagc 60
tcgtctgtgg cgccggacat ggagctggac cgcccanacc tggaggacta nctcccgccc 120
gactcgnncc cgcagaggcg ccccggaatc tccanctgcg cgatctgctg gacatcncgc 180
cggtgctcac cgaggcagcg ggtgccattg tcgatgactc cttcacacgg ngctttaagt 240
caaattctcc agagccatgg aattggaaca tatatctgtt ccccttatgt gctttggtgt 300
ataataag 308

<210> 100
<211> 282
<212> DNA
<213> Zea mays

<400> 100

cagaaactag angttagtca cagcatggca ttaaattgtc atagtaaaca acancncact 60
gagcaactat gcaatttaat gccatgctgt gactaacttc tagtttctgg cattaaatta 120
ctgtttggct actaggaaga ccgaggtaga gaagcaaata taagaatacc ctccaacgca 180
canccaaatg acagagtaaa tgaaggtagg gttcaccttc ttgaacatga ccgtatactg 240
gttgttaaca caagttcctc tgggaaaatc agagagggtt tt 282



<210> 101 
<211> 282 
<212> DNA 
<213> Zea mays 



<400> 101 

ggcgcggctg gccgtggcgc tggtcctgcc gtacagtact cgacgccgat cctggcngcg 60
acnggcatgt cgtggcggct caaagggtng cgcccngngc ttgcnnngcc gtgctccggc 120
gggcgctgnc agctgttcgt gtgcaacnac cggacgctga tcgacccngt gtacgtgtcc 180
gtagcgtgga ccgggaaatg cgcgncgtgt nctacagnct gangcggntn tcggagctca 240
tctcccccat ngncggaang tgcacctgan accgggaacg gg 282



<210> 102 
<211> 290 
<212> DNA 
<213> Zea mays 

<400> 102 

ggacgcggca ccatgcgcgc cgagctggcc agtggcgacg tggccgtgtg ccccgagggc 60
accacgtgcc gggagccctt cctgctccgc ttctccaagc tcttcgcgga gctcagcgac 120
aggatcgtgc ccgtggcgat gaactaccgc gtggggctct tccacccgac gacggcgcgc 180
gggtggaaag ccatggaccc catcttcttc ttcatgaacn gcggcccgtg tacgaggtga 240
cgttcctgaa ccantccccg caaagcgacg tgcgcggcgg ggaagagccc 290



<210> 103 
<211> 279 
<212> DNA 
<213> Zea mays 



<400> 103 

acgaggtgac gttcctgaac cagctccccg cagaggcgac gtgcgcggcg gggaagagcc 60
ccgttgatgt agccaactac gttcagcgga tactcgctgc cacgctcggg ttcgagtgca 120
ccaccctcac aaggaaggac aaatacacgg tgctcgccgg caacgacggc gtcctgaacg 180
ccaagccggc ggcggcccgg aagccggctt ggcagagccg cgtgaaggaa gtcctcgggt 240
tctgctccac taacaattac accttgccca gatctggac 279



<210> 104 
<211> 315 
<212> DNA 
<213> Zea mays 



<400> 104 

gcccgagcgc atcgtctact acacctacaa gctcatgggc atcaggctcg tcgtcaaggg 60
caccccgccg ccgccgccca agaagggcca cccgggcgtc ctcttcgtct gcaaccaccg 120
caccgtgctc gaccccgtcg aggtggccgt ggcgctgcgc cgcaangtca gctgcgtcac 180
tacagcatct ccaagttctc cgagctcatc tcgcccatca aggccgtagc agnaaagcag 240
gtcgcaaatg gagcagnagc gagtcgatgg aagngaattg gcgactggtc atctgcncga 300 
aggnacactg cggag 315 

<210> 105 
<211> 314 
<212> DNA 
<213> Zea mays 

<400> 105 

cgagacaccg agcacgtact accagcaaga tggtggcgtc tcccagattc aagcccatcg 60 

aggagtgctg ctcggagggg cggtcggagc agacggtggc cgccgacctg gacggcacgc 120 

tgctcatctc caggagcgcg ttcccctact acctcctcgt ggctctcgag gccggcagcg 180 

tcctccgcgc cgcgctgctg ctcctgtccg tgccgttcgt ctacgtcacc tacgccttct 240 

tctccgagtc gctggccatc agcacgctgg tgtacatctc cgtggcgggg ctcaaggtgc 300 

gcanatcgag atgg 314 

<210> 106 

<211> 291 

<212> DNA 

<213> Zea mays 



<400> 106 

ctctgggtct ggggccgaga caccgagcac gtactaccag caagatggtg gcgtctccca 60 
gattcaagcc catcgaggag tgctgctcgg aggggcggtc ggagcagacg gtggccgccg 120 
acctggacgg cacgctgctc atntccagga gcgcgttccc ctactacctc ctcgtggctc 180 
tcgaggccgg cagcgtcctc cgcgccgcgc tgctgctcct gtccgtgccg ttcgtctacg 240 
tcacctacgc cttcttctcc gagtcgctgg ccatcagcac gctggtgtac a 291 

<210> 107 
<211> 300 
<212> DNA 
<213> Zea mays 

<400> 107 

gcacgcagca gtacgacgtc tctcctctgg gtctggggcc gagacaccga gcacgtacta 60 

ccagcaagat ggtggcgtct cccagattca agcccatcga ggagtgctgc tcggaggggc 120 

ggtcggagca gacggtggcc gccgacctgg acggcacgct gctcatctcc aggagcgcgt 180 

tcccctacta cctcctcgtg gctctcgagg ccggcagcgt cctccgcgcc gcgctgctgc 240 

tcctgtccgt gccgttcgtc tacgtcacct acgccttctt ctccgagtcg ctggccatca 300 

<210> 108 

<211> 284 

<212> DNA 

<213> Zea mays 

<400> 108 

gnggccgaga caccgagcac gtactaccag cangatggtg gcgtctccca gattcangcc 60 

antcgaggag tgctgctcgg aggggcggtc ggagcagacg gtggccgccg acctggacgg 120 

cacgctgctc atctccagga gcgcgttccc ctacnacctc ctcgtggctc tcgaggccgg 180 

cagcgtcctc cgcgccgcgc tgctgctcct gtccgtgccg ttcgtctacg tcactacgcc 240 

ttcttctccg agtcgctggc catcaanacg ctggtgtaca tctc 284 

<210> 109 
<211> 280 
<212> DNA 
<213> Zea mays 



<400> 109 

ctcctctggg tctggggccg agacaccgag cacgtactac cagcaagatg gtggcgtctc 60
ccagattcaa gcccatcgag gagtgctgct cggaggggcg gtcggagcag acggtggccg 120
ccgacctgga cggcacgctg ctcatctcca ggagcgcgtt ccnctactac ctcctcgtgg 180
ctctcgaggc cggcagcgtc ctccgcgccg cgctgctgct cctgtccgtn ccgttcgtct 240
acgtcaccta cgcnttnttc tccgagtcgc tggccatcag 280



<210> 110 
<211> 287 
<212> DNA 
<213> Zea mays 
<400> 110

cgtctctcct ctgggtctgg ggccgagaca ccgagcacgt actaccagca agatggtggc 60
gtctcccaga ttcaagccca tcgaggagtg ctgctcggag gggcggtcgg agcagacggt 120
ggccgccgac ctggacggca gctgctcatc tccaggagcg cgttccccta ctacctcctc 180
gtggctctcg aggccggcag cgtcctccgc gccgcgctgc tgctcctgtc cgtgccgttc 240
gtctacgtca ctacggcttc ttctccgagt cgctggccat cagcacg 287

<210> 111
<211> 286
<212> DNA
<213> Zea mays

<400> 111

cgcacagtta cgacgtctct cctctgggtc tggggccgag acaccgagca cgtactacca 60
gcaagatggt ggcgtctccc agattcaagc ccatcgagga gtgctgctcg gaggggcggt 120
cggagcagac ggtggccgcc gacctggacg gcacgctgct catctccagg agcgcgttcc 180
cctactactc ctcgtgctct cgaggccggc aggtcctccg cgccgcgctg tgctcctgtc 240
gtgcgttcgt ctagtcacta cgcttttctc gancgtggca ataana 286

<210> 112
<211> 323
<212> DNA
<213> Zea mays

<400> 112

gttattccct gaaggtacca caacaaatgg gagattcctg atttcgttcc aacatggtgc 60
attcatacct ggctaccctg ttcaacctgt tgttgtccgt tatccacatg tgcactttga 120
tcaatcatgg gggnatatat cgttattaaa gctcatgttt aagatgttca cccaatttca 180
taatttcatg gaggtagagt accttcctgt tgtctaccct cctgagatca agcaagagaa 240
tgcccttcat tttgcggagg ataccagcta tgctatggca cgtgccctca atgtcttgcc 300
aacttcctat tcatatggtg att 323



<210> 113 
<211> 312 
<212> DNA 
<213> Zea mays 



<400> 113 

cgataaggcc cttttcgaag agcttctacc gtcggatcaa cagattcttg gccgagctgc 60
tgtggcttca gcttgtctgg gtggtggact ggtgggcagg tgttaaggta caactgcatg 120
cagatgagga aacttacaga tcaatgggta aagagcatgc actcatcata tcaaatcatc 180
ggagtgatat tgattggctc attggatgga tattggccca gcgttcaggg tgccttggaa 240
gtacacttgc tgtcatgaag aagtcatcca agttccttcc agttattggc tggtcaatgt 300
ggtttgcaga gt 312



<210> 114 
<211> 279 
<212> DNA 
<213> Zea mays 



<400> 114 

agtggggtct ccaaaggttg aaagacttcc ctagaccatt ttggctagct ctttttgttg 60 
agggtactcg ctttactcca gcaaagcttc tcgcagctca ggagtatgcg gcttcccagg 120 
gcttaccagc tcctagaaat gtacttattc cacgtaccaa gggatttgta tctgccgtaa 180 
gtattatgcg agattttgtt ccagccattt acgatacaac tgtaatagtt cctaaagatt 240 
cccctcaacc aacaatgctg cggattttga aagggcaat 279 

<210> 115 
<211> 304 
<212> DNA 
<213> Zea mays 

<400> 115 

cgtcaacgcc atccaggccg tcctatttgt gacgataagg cccttttcga agagcttcta 60 

ccgtcggatc aacagattct tggccgagct gctgtggctt cagcttgtct gggtggtgga 120 

ctggtgggca ggtgttaagg tacaactgca tgcagatgag gaaacttaca gatcaatggg 180 

taaagagcat gcactcatca tatcaaatca tcggagtgat attgattggc tcatggatgg 240 

atattggccc agcgttcagg gtgccttgga agtacattgc tgtcatgaag aagtcatcca 300 

agtt 304 
<210> 116
<211> 259 
<212> DNA 
<213> Zea mays 

<400> 116 

cttcctcctg tccggcctca tcgtcaacgc catccaggcc gtcctatttg tgacgataag 60 
gcccntttcg aagagcttct aacgtcggat caacagattc ntggccgagc tgctgtggct 120 
tcagcttgtc tgggtggtgg acnggtgggc aggtgttaag gtacaactgc atgcngatga 180 
ggaaacttac agatcnatgg gtanagagca tgcactcatc atatcaaatc atcggagtga 240 
tattgattgg cncattgga 259 

<210> 117 
<211> 235 
<212> DNA 
<213> Zea mays 

<400> 117 

attccacgta ccaagggatt tgtatctgct gtaagtatta tgcgagattt tgttccagcc 60 

atttatgata caactgtaat agttcctaaa gattcccctc aaccaacaat gctgcggatt 120 

ttgaaagggc aatcatcagt gatacatgtc cgcatgaaac gtcatgcaat gagtgagatg 180 

ccaaaatcag atgaggatgt ttcaaaatgg tgtaaagaca tttttgtggc aaagg 235 



<210> 118 
<211> 282 
<212> DNA 
<213> Zea mays 



<400> 118 

tgagatgcca aaatcagatg atgacgtttc aaaatggtgt aaagacattt ttgtgacaaa 60 
ggatgcctta ctggacaaac atttggcaac aggcactttc gatgaggaga ttagacctat 120 
cggccgccca gtgaaatcat tgctggtgac cctgttttgg tcgtgcctgc tgttgtttgg 180 
tgccatcgag ttcttcaagt ggacgcagct cctatcgaca tggagaggag tggcattcac 240 
tgccgcagga tggcgctcgt gacaggggtc atgcacgtct tc 282 

<210> 119 
<211> 166 
<212> DNA 
<213> Zea mays 

<400> 119 

ctggtgggca ggcgttaagg tacaactaca tgcggatgag gacacttacc gatcaatggg 60 
taaagagcat gcactcgtca tatcaaatca tcgaagtgat attgattggc ttattggatg 120 
gatattggcc cagcgctcag ggtgccttgg aagtacgctc gctgtc 166 

<210> 120 
<211> 234 
<212> DNA 
<213> Zea mays 



<400> 120 

agtcanccaa gntccttcca gtcattggct ggtcaatgtg gtttgcagag tacctctttt 60
nggagaggag ctgggccaag gatgaaaaga cactaaagtg gggtctccaa aggttgaaag 120
acttccctag accatttngg ctagctcttn tttgtngagg gnantcgctt tactccagca 180
angnttntng aggnnncagn agnnncgggn ttcccanggg ttaacagncc cana 234



<210> 121 
<211> 210 
<212> DNA 
<213> Zea mays 



<400> 121 

gtgagatgcn aaaatcagat gatgacgttt caaaatggtg taaagacatt tttgtggaca 60
aaggatgcct tactggacaa acatttggca acaggcactt tcgatgagga gattagacct 120
atcggccgcc cagtgaaatc atngctggtg accctgtnnt ggtcgtgcct gctgttgttt 180
ggtgccatcg agntcttcaa gtggacgcag 210



<210> 122 
<211> 274 
<212> DNA 
<213> Zea mays



<400> 122 

acncccgaat ccgccgcgcg cgcnccgtcc tcgtcgccgg cggaggcgcc cgcnaccgcc 60
cacagcagcc tatcgccgga gaaggaacgc cgcggggagc ttttccacng ccatctcccg 120
tctgacccct ccgagatcgn aagcggcggc catggcgatc ccgctcgtgc tcgtcgtgct 180
cccgctcggc ctcctcttcc tcctgtccgg cctcatcgtc aacaccatcc aggccatcct 240
atttgtgaca ataaggccct tttccaagag cttg 274



<210> 123 
<211> 305
<212> DNA 
<213> Zea mays 



<400> 123 

ttgcactgag gaaaggccat tagggatata tcaagtacat acataagagc agcttgatga 60
agttgcctat ttttagctgg gcatttcaca tttttgagtt tatcccggta gaacggaaat 120
gggagattga tgaagcaatt attcagaaca agctatcaaa acttaagaac ccgagagatc 180
ctatctggtt ggcggttttt cctgaaggca cggattatac tgagaagaaa tgcatcatga 240
gtcaagagta tgcttcagaa catggcttgc ctatgctaga acatgtcctc cttccaaaga 300
caagg 305

<210> 124
<211> 279
<212> DNA
<213> Zea mays

<400> 124

ccagattttc tggacaatgt gtatggcgtt gatccttctg aagtccacat ccacgtcaga 60
atggttcagc tccatcacat ccccacaaca gaagacaaga taacagaatg gatggncgag 120
aggtttaggc agaaggacca gctcctggca gatttcttca tgaaggggca tttcctgatg 180
aaaggaactg aaaggagatc tgtcgacgcc gagtgcctgg caaactttct taaccagtag 240
tatgcttgac ggccnatctg gtttgtacct aaactcttt 279

<210> 125
<211> 219
<212> DNA
<213> Zea mays

<400> 125

agattttntg gacaatgtgt atggngttga tccttntgaa gtncacatcc acgtnagaat 60
ggttcagctc catcacatcc ccacaacagn agacaagata acagaangga tggtagagag 120
gtttaggcag aaggaccagc tcctggcaga tttcttcatg aaggggcact ttcctgatga 180
aggaactgaa ggagatctgt cgacgccgaa gtgcctggc 219



<210> 126 
<211> 293 
<212> DNA 
<213> Zea mays 



<400> 126 

taccatagat gctgtgtacg acatcacgat cgcntacaaa caccggcngc ngacatttct 60
ngacaacgtc tacngcgtgg ntccttcgga agtccacatc cacatcanca gcatccaggt 120
ctccgacata ncggcgtccg aaaaacgggg tggctggcng gntnngtgga gcggttcaag 180
gcntnganna acgagctngc tgttcggggc tttctaccgc ggctggggcc aatttcnccc 240
cgaacgaaag ggaaaaaggg gaaccgaagg ggggaacctg ttngaacggg ncc 293



<210> 127 
<211> 6 
<212> PRT 

<213> conserved sequence 
<400> 127 

Val Xaa Asn His Xaa Ser 
1 5 



<210> 128 
<211> 6 
<212> PRT 
<400> 128

Val Thr Tyr Ser Xaa Ser 
1 5 



<210> 129 
<211> 7 
<212> PRT 

<213> conserved sequence 
<400> 129 

Val Xaa Leu Thr Arg Xaa Arg 
1 5 



<210> 130 
<211> 5 
<212> PRT 

<213> conserved sequence 
<400> 130 

Cys Pro Glu Gly Thr 
1 5 



<210> 131 
<211> 5 
<212> PRT 

<213> conserved sequence 
<400> 131 

Ile Val Pro Val Ala
1 5 



<210> 132 
<211> 7 
<212> PRT 

<213> conserved sequence 
<400> 132 

Leu Xaa Xaa Gly Asp Leu Val 
1 5 



<210> 133 
<211> 6 
<212> PRT 

<213> conserved sequence 
<400> 133 

Phe Xaa Xaa Gly Ala Phe 
1 5 



<210> 134 
<211> 6 
<212> PRT 

<213> Synthetic Oligonucleotide 
<400> 134 

Val Ala Asn Xaa Xaa Gln
1 5 



<210> 135 
<211> 30 
<212> DNA 
<213> Synthetic Oligonucleotide
<400> 135 

ccatccgctt caagggaacg acacccatca 30 

<210> 136 
<211> 31 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 136 

tccctgtctt gcttgatgaa cttaaagctt g 31 

<210> 137 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 137 

acagcaggag tgtctgatga tggcagattc 30 

<210> 138 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 138 

actggagttc cagccaaaaa tgcacctgtc 30 

<210> 139 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 139 

gatacaccct tgaaatcagg cgattttgct 30 

<210> 140 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 140 

ttgcaaattc aattcctgtt tcaccgggcc 30 

<210> 141 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 141 

gttttctgct attccagaag gcgtcaacaa 30 

<210> 142 
<211> 32 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 142 

cattgaagat ccgtccgtga agttncctta cc 32 

<210> 143 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 143 

tcgagctgtg atcgatgatt ggctgtgaag 30 



<210> 144 
<211> 30
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 144 

gtctcttcaa aaacacacac acacgtctct 

<210> 145 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 145 

gtctcttcaa aaacacacac acacgtctct 

<210> 146 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 146 

gtagagagcc ttacttgctt cggtttagtc 

<210> 147 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 147 

acgtcatcgt acctgttgct attgactcac 

<210> 148 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 148 

acttttccat tgtcagggac tcctcgacac 

<210> 149 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 149 

acggtgtagg aagggaaagg attcaaaagg 

<210> 150 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 150 

gcgatgaact acagagtcgg attcttcctc 

<210> 151 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 151 

ccggtttacg agattacgtt cttgaaccag 

<210> 152 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 152 

caatggagac aaggctcgaa agtgctaacc 
<210> 153
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 153 

attctctgaa catagttcgc cacggtcatg 

<210> 154 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 154 

gaaatccaac gccttcccaa tatcactctg 

<210> 155 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 155 

cttcaacttt ccatcaggat cttggcacgt 

<210> 156 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 156 

accacttgtt agagacctta cctgcttagg 

<210> 157 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 157 

tcctacctac accatccaat ttctcgaccc 

<210> 158 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 158 

ctgcgtcaag tgagcaactc agttcttgca 

<210> 159 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 159 

tgggaagcag cacgttgttc agtatcggaa 

<210> 160 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 160 

tagcctctgt gtaatctgtg ccctcgggga 

<210> 161 
<211> 1702 
<212> DNA 

<213> Simmondsia chinensis 
<400> 161

gaattctagc ctctctcctc ctgcaattct acttgctttc tacgatcttt ccctctctct 60 
ctctaaaacc ttaaaattgg aatggaatcg tttaaaaata tgatcttttt gtaattgaat 120 
tagtataatt atatctgggt aatcttgaat ttgttggtga ggccatgggg atcccagctg 180 
cggctgtgat tgtaccgctt ggcttgctct tcttcttctc tggtctcttc atcaacttca 240 
ttcaggcaat ttgttttgtg ctcgtgcggc cactgtcaaa gnntacatac agaaggatta 300 
acagggtgct ggtggaattg ttgtggcttg agctgatatg gctcgtagat tggtgggcaa 360 
gtgttaagat caagttgttc acagatcctg atacctttcg gctaatgggt aaagagcatg 420 
cacttgtgat atcaaaccac agaagtgata ttgattggct tgttggatgg gtgttggccc 480 
agagatcagg ctgcctggga agcacactgg ctgtcatgaa gaaatcatca aagtttctcc 540 
cggtcatagg ttggtctatg tggttttctg agtacctttt tcttgagaga agctgggcca 600 
aggatgaaag cacattgaag ttaggtcttc aacgcctcaa ggactaccct ctgcctttct 660 
ggttggctct tttcgtagaa ggaacacgat ttacccaagc taaactttta gcagctcaag 720 
aatatgctac ttcaatggga ttgccagttc ctagaaatac tttgatccct cgtactaagg 780 
gatttgtttc agccgtgagc catatgcgtt cgtttgtccc ggccatatat gatgtaacgg 840 
tggccatccc taaatcttct tcgcagccta caatgctcag acttttcaaa ggccagccat 900 
ccacggttca tgtacacatc aagcgccgct cgatgaaaga tctccctgaa gcagcagatg 960 
atgttgcaca atggtgtcga gacacattcg tcgcaaagga tgcactcctg gacaagcata 1020
atgtagatga cactttcgga gatgagtatc tgcaggacac tggccggcct ttgaaatctc 1080
tctttgtagc agtctcttgg gcattgattc tcatcctggg aggtttgaaa ttcctacgat 1140
ggtcgtccct tctatcatca tggaaggggg tcgccttctc agccgcatgc cttgtgctcg 1200
tcaccattct tatgcagatc ttaatccaat tttctcaatc cgagcgctcg actcctgcta 1260
aggtagcccc aggaaagccc aagaacatgg tatcagaacc cacggaaacg caacgacata 1320
agcagcacta aaagtatata tggaccccaa ctaagaagat tcagacgcaa gccacagttg 1380
attcaactgt tcagaatgtc aaatatagtt tgagaaacaa aagatcaaga ttagctgatg 1440
aagagcctaa tgaacctaca tacttggatc tgtcgtcgcc accgtctgct gctagctcgt 1500
tatcagaatt cgtgattccg ggaccgatcc cggatcttag ccttctatgc atggattatg 1560
atagtatctt aaatttcttt aatgatgtac cggaattata atgttagtta attaggggga 1620
tgagcattgt ttgggtttat atcgtggtaa atccttgtat tgtttataag atttgaagaa 1680
aattcgattc gagtgctctg aa 1702

<210> 162 
<211> 387 
<212> PRT 

<213> Simmondsia chinensis 
<400> 162 

Met Gly Ile Pro Ala Ala Ala Val Ile Val Pro Leu Gly Leu Leu Phe
1 5 10 15

Phe Phe Ser Gly Leu Phe Ile Asn Phe Ile Gln Ala Ile Cys Phe Val
20 25 30

Leu Val Arg Pro Leu Ser Lys Thr Tyr Arg Arg Ile Asn Arg Val Leu
35 40 45

Val Glu Leu Leu Trp Leu Glu Leu Ile Trp Leu Val Asp Trp Trp Ala
50 55 60

Ser Val Lys Ile Lys Leu Phe Thr Asp Pro Asp Thr Phe Arg Leu Met
65 70 75 80

Gly Lys Glu His Ala Leu Val Ile Ser Asn His Arg Ser Asp Ile Asp
85 90 95

Trp Leu Val Gly Trp Val Leu Ala Gln Arg Ser Gly Cys Leu Gly Ser
100 105 110

Thr Leu Ala Val Met Lys Lys Ser Ser Lys Phe Leu Pro Val Ile Gly
115 120 125

Trp Ser Met Trp Phe Ser Glu Tyr Leu Phe Leu Glu Arg Ser Trp Ala
130 135 140

Lys Asp Glu Ser Thr Leu Lys Leu Gly Leu Gln Arg Leu Lys Asp Tyr
145 150 155 160

Pro Leu Pro Phe Trp Leu Ala Leu Phe Val Glu Gly Thr Arg Phe Thr
165 170 175

Gln Ala Lys Leu Leu Ala Ala Gln Glu Tyr Ala Thr Ser Met Gly Leu
180 185 190

Pro Val Pro Arg Asn Thr Leu Ile Pro Arg Thr Lys Gly Phe Val Ser
195 200 205

Ala Val Ser His Met Arg Ser Phe Val Pro Ala Ile Tyr Asp Val Thr
210 215 220

Val Ala Ile Pro Lys Ser Ser Ser Gln Pro Thr Met Leu Arg Leu Phe
225 230 235 240

Lys Gly Gln Pro Ser Thr Val His Val His Ile Lys Arg Arg Ser Met
245 250 255

Lys Asp Leu Pro Glu Ala Ala Asp Asp Val Ala Gln Trp Cys Arg Asp
260 265 270

Thr Phe Val Ala Lys Asp Ala Leu Leu Asp Lys His Asn Val Asp Asp
275 280 285

Thr Phe Gly Asp Glu Tyr Leu Gln Asp Thr Gly Arg Pro Leu Lys Ser
290 295 300

Leu Phe Val Ala Val Ser Trp Ala Leu Ile Leu Ile Leu Gly Gly Leu
305 310 315 320

Lys Phe Leu Arg Trp Ser Ser Leu Leu Ser Ser Trp Lys Gly Val Ala
325 330 335

Phe Ser Ala Ala Cys Leu Val Leu Val Thr Ile Leu Met Gln Ile Leu
340 345 350

Ile Gln Phe Ser Gln Ser Glu Arg Ser Thr Pro Ala Lys Val Ala Pro
355 360 365

Gly Lys Pro Lys Asn Met Val Ser Glu Pro Thr Glu Thr Gln Arg His
370 375 380

Lys Gln His
385



<210> 163 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 163 

aagcttgcat gcgtcgacac aatggttcat gcgaccaagt cag 43 



<210> 164 
<211> 35 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 164 

ggtaccgtcg actcacttct tggtgttgtt gatag 

<210> 165 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 165 

ggatccgcgg ccgcacaatg acgagcttta ctacttccct tcat

<210> 166 

<211> 38 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 166 

ggatcccctg caggttagag atccattgat tctgcaat

<210> 167 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 167 

ggatccgcgg ccgcataatg gaatcagagc tcaaagat 

<210> 168 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 168 

ggatcccctg caggtcattc ttctttctga tggaaatc 

<210> 169 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 



<400> 169 

ggatccgcgg ccgcacaatg actcgttcac aagatgtttc a 

<210> 170
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 170 

ggatcccctg caggtcactt ctcttccaat ctagccag 

<210> 171 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 171 

ggatccgcgg ccgcacaatg tccggtaata agatctcgac tcttca 

<210> 172 

<211> 46 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 172 

ggatcccctg caggttattt tttcttgaca actccgttat taccgg 

<210> 173 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 173 

atatccgcgg ccgcacaatg gttatggagc aagctggaa 

<210> 174 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 174 

ggatcccctg caggtcaatg gagacaaggc tcgaaagt 

<210> 175 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 175 

ggatccgcgg ccgcacaatg tccgccaaga tttcaatatt cc

<210> 176 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 176 

ggatcccctg caggttaatt tttcttaact actccatt 

<210> 177 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 177 

ggatccgcgg ccgcacaatg ggagctcagg agaaacggcg cc 

<210> 178 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 178 

ggatcccctg caggtcacgt cttctccttc ttcaccgg 

<210> 179 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 179 

ggatccgcgg ccgcacaatg gcggatcctg atctgtcttc tcct 

<210> 180
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 180 

ggatcccctg caggttatgt tggggccaag tcaggtgcaa agat 

<210> 181 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 181

ggatccgcgg ccgcaaaatg gaaaaaaaga gtgtaccaaa ttct 44 

<210> 182 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 182 

ggatcccctg caggttattt gtttactaat ttgagggaat tttttg 46 

<210> 183 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 183 

tcgacctgca ggaagcttaa ggatggtgat tgctgc 36 

<210> 184 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 184 

ggatccgcgg ccgcttactt ctccttctcc g 

<210> 185 
<211> 39 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 185 

ggatccgcgg ccgcacaatg tcttttaggg atgtcctag 

<210> 186 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 186 

ggatcccctg caggtcaatc atccttaccc tttggtttac c 

<210> 187 
<211> 60 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 187 

atgtctttta gggatgtcct agaaagagga gatgaatttt ctgtgcggta tttcacaccg 60 

<210> 188
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 188 

tcaatcatcc ttaccctttg gtttaccctc tggaggcaga agattgtact gagagtgcac 60 

<210> 189 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 189 

ggatccgcgg ccgcacaatg aagcattccc aaaaataccg tagg 44 

<210> 190 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 190 

ggatcccctg caggtcaatg attttttttc atcacaaata c 41 

<210> 191 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 191 

atgaagcatt cccaaaaata ccgtaggtat ggaatttatg ctgtgcggta tttcacaccg 60 

<210> 192 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 192 

tcaatgattt tttttcatca caaatacaag aataagaaaa agattgtact gagagtgcac 60 

<210> 193 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 193 

ggatccgcgg ccgcacaatg ggttttgttg atttcttcga aac 43 

<210> 194 
<211> 45 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 194 

ggatcccctg caggttattt ggtctcaatt ttaatatttt tttgc 45 

<210> 195 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 195 

atgggttttg ttgatttctt cgaaacatat atggtcggtt ctgtgcggta tttcacaccg 60 

<210> 196 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 196 

ttatttggtc tcaattttaa tatttttttg caaggactcg agattgtact gagagtgcac 60 

<210> 197 

<211> 44 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 197 

ggatccgcgg ccgcacaatg gaaaagtaca ccaattggag agac 44 

<210> 198 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 198 

ggatcccctg caggctactt cctcttttta cgttgatcgc tg 42 

<210> 199 
<211> 60 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 199 

atggaaaagt acaccaattg gagagacaat ggtacgggaa ctgtgcggta tttcacaccg 60 

<210> 200 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 200 

ctacttcctc tttttacgtt gatcgctgat atattccttc agattgtact gagagtgcac 60 

<210> 201 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 201 

ggatccgcgg ccgcacaatg cctgcaccaa aactcacgga g 41 

<210> 202 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 202 

ggatcccctg caggctacgc atctccttct ttcccttc 38 

<210> 203 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 203 

atgcctgcac caaaactcac ggagaaatct gcctcttcca ctgtgcggta tttcacaccg 60 

<210> 204 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 204 

ctacgcatct ccttctttcc cttcttcttc ttcttcctct agattgtact gagagtgcac 60 



WO 00/18889 PCT/US99/22231



<210> 205 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 205 

ggatccgcgg ccgcacaatg tctgctcccg ctgccgatca taacgc 46 

<210> 206 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 206 

ggatcccctg caggtcattc tttcttttcg tgttctcttt tctg 44 

<210> 207 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 207 

atgtctgctc ccgctgccga tcataacgct gccaaaccta ctgtgcggta tttcacaccg 60 

<210> 208 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 208 

tcattctttc ttttcgtgtt ctcttttctg tcttaccagc agattgtact gagagtgcac 60 

<210> 209 
<211> 49 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 209 

ggatccgcgg ccgcacaatg ctgcatcaaa aaatagctca taaagttcg 49 

<210> 210 
<211> 49 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 210 

ggatcccctg caggtcaaaa aataaaacaa taaagtttat aaactaacc 49

<210> 211 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 211 

atgctgcatc aaaaaatagc tcataaagtt cgaaaagtcg ctgtgcggta tttcacaccg 60 

<210> 212 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 212 

tcaaaaaata aaacaataaa gtttataaac taaccaaatt agattgtact gagagtgcac 60 

<210> 213 

<211> 41 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 213 

ggatccgcgg ccgcacaatg agtgtgatag gtaggttctt g 41 

<210> 214 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 214 

ggatcccctg caggttaatg catctttttt acagatgaac c 41 

<210> 215 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 215 

atgagtgtga taggtaggtt cttgtattac ttgaggtccg ctgtgcggta tttcacaccg 60 

<210> 216 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 216

ttaatgcatc ttttttacag atgaaccttc gttatgggta agattgtact gagagtgcac 60 

<210> 217 
<211> 381 
<212> PRT 

<213> Saccharomyces sp. 

<220> 

<400> 217 

Met Ser Phe Arg Asp Val Leu Glu Arg Gly Asp Glu Phe Leu Glu Ala
1 5 10 15

Tyr Pro Arg Arg Ser Pro Leu Trp Arg Phe Leu Ser Tyr Ser Thr Ser
20 25 30

Leu Leu Thr Phe Gly Val Ser Lys Leu Leu Leu Phe Thr Cys Tyr Asn
35 40 45

Val Lys Leu Asn Gly Phe Glu Lys Leu Glu Thr Ala Leu Glu Arg Ser
50 55 60

Lys Arg Glu Asn Arg Gly Leu Met Thr Val Met Asn His Met Ser Met
65 70 75 80

Val Asp Asp Pro Leu Val Trp Ala Thr Leu Pro Tyr Lys Leu Phe Thr
85 90 95

Ser Leu Asp Asn Ile Arg Trp Ser Leu Gly Ala His Asn Ile Cys Phe
100 105 110

Gln Asn Lys Phe Leu Ala Asn Phe Phe Ser Leu Gly Gln Val Leu Ser
115 120 125

Thr Glu Arg Phe Gly Val Gly Pro Phe Gln Gly Ser Ile Asp Ala Ser
130 135 140

Ile Arg Leu Leu Ser Pro Asp Asp Thr Leu Asp Leu Glu Trp Thr Pro
145 150 155 160

His Ser Glu Val Ser Ser Ser Leu Lys Lys Ala Tyr Ser Pro Pro Ile
165 170 175

Ile Arg Ser Lys Pro Ser Trp Val His Val Tyr Pro Glu Gly Phe Val
180 185 190

Leu Gln Leu Tyr Pro Pro Phe Glu Asn Ser Met Arg Tyr Phe Lys Trp
195 200 205

Gly Ile Thr Arg Met Ile Leu Glu Ala Thr Lys Pro Pro Ile Val Val
210 215 220

Pro Ile Phe Ala Thr Gly Phe Glu Lys Ile Ala Ser Glu Ala Val Thr
225 230 235 240

Asp Ser Met Phe Arg Gln Ile Leu Pro Arg Asn Phe Gly Ser Glu Ile
245 250 255

Asn Val Thr Ile Gly Asp Pro Leu Asn Asp Asp Leu Ile Asp Arg Tyr
260 265 270

Arg Lys Glu Trp Thr His Leu Val Glu Lys Tyr Tyr Asp Pro Lys Asn
275 280 285

Pro Asn Asp Leu Ser Asp Glu Leu Lys Tyr Gly Lys Glu Ala Gln Asp
290 295 300

Leu Arg Ser Arg Leu Ala Ala Glu Leu Arg Ala His Val Ala Glu Ile
305 310 315 320

Arg Asn Glu Val Arg Lys Leu Pro Arg Glu Asp Pro Arg Phe Lys Ser
325 330 335

Pro Ser Trp Trp Lys Arg Phe Asn Thr Thr Glu Gly Lys Ser Asp Pro
340 345 350

Asp Val Lys Val Ile Gly Glu Asn Trp Ala Ile Arg Arg Met Gln Lys
355 360 365

Phe Leu Pro Pro Glu Gly Lys Pro Lys Gly Lys Asp Asp
370 375 380

<210> 218
<211> 396
<212> PRT

<213> Saccharomyces sp.

<220>

<400> 218

Met Lys His Ser Gln Lys Tyr Arg Arg Tyr Gly Ile Tyr Glu Lys Thr
1 5 10 15

Gly Asn Pro Phe Ile Lys Gly Leu Gln Arg Leu Leu Ile Ala Cys Leu
20 25 30

Phe Ile Ser Gly Ser Leu Ser Ile Val Val Phe Gln Ile Cys Leu Gln
35 40 45

Val Leu Leu Pro Trp Ser Lys Ile Arg Phe Gln Asn Gly Ile Asn Gln
50 55 60

Ser Lys Lys Ala Phe Ile Val Leu Leu Cys Met Ile Leu Asn Met Val
65 70 75 80

Ala Pro Ser Ser Leu Asn Val Thr Phe Glu Thr Ser Arg Pro Leu Lys
85 90 95

Asn Ser Ser Asn Ala Lys Pro Cys Phe Arg Phe Lys Asp Arg Ala Ile
100 105 110

Ile Ile Ala Asn His Gln Met Tyr Ala Asp Trp Ile Tyr Leu Trp Trp
115 120 125

Leu Ser Phe Val Ser Asn Leu Gly Gly Asn Val Tyr Ile Ile Leu Lys
130 135 140

Lys Ala Leu Gln Tyr Ile Pro Leu Leu Gly Phe Gly Met Arg Asn Phe
145 150 155 160

Lys Phe Ile Phe Leu Ser Arg Asn Trp Gln Lys Asp Glu Lys Ala Leu
165 170 175

Thr Asn Ser Leu Val Ser Met Asp Leu Asn Ala Arg Cys Lys Gly Pro
180 185 190

Leu Thr Asn Tyr Lys Ser Cys Tyr Ser Lys Thr Asn Glu Ser Ile Ala
195 200 205

Ala Tyr Asn Leu Ile Met Phe Pro Glu Gly Thr Asn Leu Ser Leu Lys
210 215 220

Thr Arg Glu Lys Ser Glu Ala Phe Cys Gln Arg Ala His Leu Asp His
225 230 235 240

Val Gln Leu Arg His Leu Leu Leu Pro His Ser Lys Gly Leu Lys Phe
245 250 255

Ala Val Glu Lys Leu Ala Pro Ser Leu Asp Ala Ile Tyr Asp Val Thr
260 265 270

Ile Gly Tyr Ser Pro Ala Leu Arg Thr Glu Tyr Val Gly Thr Lys Phe
275 280 285

Thr Leu Lys Lys Ile Phe Leu Met Gly Val Tyr Pro Glu Lys Val Asp
290 295 300

Phe Tyr Ile Arg Glu Phe Arg Val Asn Glu Ile Pro Leu Gln Asp Asp
305 310 315 320

Glu Val Phe Phe Asn Trp Leu Leu Gly Val Trp Lys Glu Lys Asp Gln
325 330 335

Leu Leu Glu Asp Tyr Tyr Asn Thr Gly Gln Phe Lys Ser Asn Ala Lys
340 345 350

Asn Asp Asn Gln Ser Ile Val Val Thr Thr Gln Thr Thr Gly Phe Gln
355 360 365

His Glu Thr Leu Thr Pro Arg Ile Leu Ser Tyr Tyr Gly Phe Phe Ala
370 375 380

Phe Leu Ile Leu Val Phe Val Met Lys Lys Asn His
385 390 395



<210> 219 
<211> 479 
<212> PRT 

<213> Saccharomyces sp. 
<220> 



<400> 219 

Met Gly Phe Val Asp Phe Phe Glu Thr Tyr Met Val Gly Ser Arg Val
1 5 10 15

Gln Phe Lys Gln Leu Asp Ile Ser Asp Trp Leu Ser Leu Thr Pro Arg
20 25 30

Leu Leu Ile Leu Phe Gly Tyr Phe Tyr Leu His Ser Phe Phe Thr Ala
35 40 45

Ile Asn Gln Phe Leu Gln Phe Ile Asn Thr Asn Ser Phe Cys Leu Arg
50 55 60

Leu His Leu Leu Tyr Asp Arg Phe Trp Ser His Val Pro Ile Ile Gly
65 70 75 80

Glu Tyr Lys Ile Arg Leu Leu Ser Arg Ala Leu Thr Tyr Ser Lys Leu
85 90 95

Lys Ile Ile Pro Thr Leu Asp Lys Val Leu Glu Ala Ile Glu Ile Trp
100 105 110

Phe Gln Leu His Leu Val Glu Met Thr Phe Glu Lys Lys Lys Asn Val
115 120 125

Gln Ile Phe Ile Thr Glu Gly Ser Asp Asp Leu Asn Phe Phe Lys Asp
130 135 140

Ser Lys Phe Gln Thr Thr Leu Met Ile Cys Asn His Arg Ser Val Asn
145 150 155 160

Asp Tyr Thr Leu Ile Asn Tyr Leu Phe Leu Lys Ser Cys Pro Thr Lys
165 170 175

Phe Tyr Thr Lys Trp Glu Phe Leu Gln Lys Leu Arg Lys Gly Glu Asp
180 185 190

Leu Ala Glu Trp Pro Gln Leu Lys Phe Leu Gly Trp Gly Lys Met Phe
195 200 205

Asn Phe Pro Arg Leu Asp Leu Leu Lys Asn Ile Phe Phe Lys Asp Glu
210 215 220

Thr Leu Ala Leu Ser Ser Asn Glu Leu Arg Asp Ile Leu Glu Arg Gln
225 230 235 240

Asn Asn Gln Ala Ile Thr Ile Phe Pro Glu Val Asn Ile Met Ser Leu
245 250 255

Glu Leu Ser Ile Ile Gln Arg Lys Leu His Gln Asp Phe Pro Phe Val
260 265 270

Ile Asn Phe Tyr Asn Leu Leu Tyr Pro Arg Phe Lys Asn Phe Thr Thr
275 280 285

Leu Met Ala Ala Phe Ser Ser Ile Lys Asn Ile Lys Arg Lys Lys Asn
290 295 300

Arg Asn Asn Ile Ile Lys Glu Ala Arg Tyr Leu Phe His Arg Glu Leu
305 310 315 320

Asp Lys Leu Val His Lys Ser Met Lys Met Glu Ser Ser Lys Val Ser
325 330 335

Asp Lys Thr Thr Pro Pro Met Ile Val Asp Asn Ser Tyr Leu Leu Thr
340 345 350

Lys Lys Glu Glu Ile Ser Ser Gly Lys Pro Lys Val Val Arg Ile Asn
355 360 365

Pro Tyr Ile Tyr Asp Val Thr Ile Ile Tyr Tyr Arg Val Lys Tyr Thr
370 375 380

Asp Ser Gly His Asp His Thr Asn Gly Asp Leu Arg Leu His Lys Gly
385 390 395 400

Tyr Gln Leu Glu Gln Ile Ser Pro Thr Ile Phe Glu Met Ile Gln Pro
405 410 415

Glu Met Glu Ser Glu Asn Asn Ile Lys Asp Lys Asp Pro Ile Val Val
420 425 430

Met Val Asn Val Lys Lys His Gln Ile Gln Pro Leu Leu Ala Tyr Asn
435 440 445

Asp Glu Ser Leu Glu Lys Trp Leu Glu Asn Arg Trp Ile Glu Lys Asp
450 455 460

Arg Leu Ile Glu Ser Leu Gln Lys Asn Ile Lys Ile Glu Thr Lys
465 470 475



<210> 220 
<211> 300 
<212> PRT 

<213> Saccharomyces sp. 
<400> 220 

Met Glu Lys Tyr Thr Asn Trp Arg Asp Asn Gly Thr Gly Ile Ala Pro
1 5 10 15

Phe Leu Pro Asn Thr Ile Arg Lys Pro Ser Lys Val Met Thr Ala Cys
20 25 30

Leu Leu Gly Ile Leu Gly Val Lys Thr Ile Ile Met Leu Pro Leu Ile
35 40 45

Met Leu Tyr Leu Leu Thr Gly Gln Asn Asn Leu Leu Gly Leu Ile Leu
50 55 60

Lys Phe Thr Phe Ser Trp Lys Glu Glu Ile Thr Val Gln Gly Ile Lys
65 70 75 80

Lys Arg Asp Val Arg Lys Ser Lys His Tyr Pro Gln Lys Gly Lys Leu
85 90 95

Tyr Ile Cys Asn Cys Thr Ser Pro Leu Asp Ala Phe Ser Val Val Leu
100 105 110

Leu Ala Gln Gly Pro Val Thr Leu Leu Val Pro Ser Asn Asp Ile Val
115 120 125

Tyr Lys Val Ser Ile Arg Glu Phe Ile Asn Phe Ile Leu Ala Gly Gly
130 135 140

Leu Asp Ile Lys Leu Tyr Gly His Glu Val Ala Glu Leu Ser Gln Leu
145 150 155 160

Gly Asn Thr Val Asn Phe Met Phe Ala Glu Gly Thr Ser Cys Asn Gly
165 170 175

Lys Ser Val Leu Pro Phe Ser Ile Thr Gly Lys Lys Leu Lys Glu Phe
180 185 190

Ile Asp Pro Ser Ile Thr Thr Met Asn Pro Ala Met Ala Lys Thr Lys
195 200 205

Lys Phe Glu Leu Gln Thr Ile Gln Ile Lys Thr Asn Lys Thr Ala Ile
210 215 220

Thr Thr Leu Pro Ile Ser Asn Met Glu Tyr Leu Ser Arg Phe Leu Asn
225 230 235 240

Lys Gly Ile Asn Val Lys Cys Lys Ile Asn Glu Pro Gln Val Leu Ser
245 250 255

Asp Asn Leu Glu Glu Leu Arg Val Ala Leu Asn Gly Gly Asp Lys Tyr
260 265 270

Lys Leu Val Ser Arg Lys Leu Asp Val Glu Ser Lys Arg Asn Phe Val
275 280 285

Lys Glu Tyr Ile Ser Asp Gln Arg Lys Lys Arg Lys
290 295 300



<210> 221 
<211> 759 
<212> PRT 

<213> Saccharomyces sp. 
<400> 221 

Met Pro Ala Pro Lys Leu Thr Glu Lys Phe Ala Ser Ser Lys Ser Thr 
1 5 10 15 

Gln Lys Thr Thr Asn Tyr Ser Ser Ile Glu Ala Lys Ser Val Lys Thr
20 25 30

Ser Ala Asp Gln Ala Tyr Ile Tyr Gln Glu Pro Ser Ala Thr Lys Lys
35 40 45

Ile Leu Tyr Ser Ile Ala Thr Trp Leu Leu Tyr Asn Ile Phe His Cys
50 55 60

Phe Phe Arg Glu Ile Arg Gly Arg Gly Ser Phe Lys Val Pro Gln Gln
65 70 75 80

Gly Pro Val Ile Phe Val Ala Ala Pro His Ala Asn Gln Phe Val Asp
85 90 95

Pro Val Ile Leu Met Gly Glu Val Lys Lys Ser Val Asn Arg Arg Val
100 105 110

Ser Phe Leu Ile Ala Glu Ser Ser Leu Lys Gln Pro Pro Ile Gly Phe
115 120 125

Leu Ala Ser Phe Phe Met Ala Ile Gly Val Val Arg Pro Gln Asp Asn
130 135 140

Leu Lys Pro Ala Glu Gly Thr Ile Arg Val Asp Pro Thr Asp Tyr Lys
145 150 155 160

Arg Val Ile Gly His Asp Thr His Phe Leu Thr Asp Cys Met Pro Lys
165 170 175

Gly Leu Ile Gly Leu Pro Lys Ser Met Gly Phe Gly Glu Ile Gln Ser
180 185 190

Ile Glu Ser Asp Thr Ser Leu Thr Leu Arg Lys Glu Phe Lys Met Ala
195 200 205

Lys Pro Glu Ile Lys Thr Ala Leu Leu Thr Gly Thr Thr Tyr Lys Tyr
210 215 220

Ala Ala Lys Val Asp Gln Ser Cys Val Tyr His Arg Val Phe Glu His
225 230 235 240

Leu Ala His Asn Asn Cys Ile Gly Ile Phe Pro Glu Gly Gly Ser His
245 250 255

Asp Arg Thr Asn Leu Leu Pro Leu Lys Ala Gly Val Ala Ile Met Ala
260 265 270

Leu Gly Cys Met Asp Lys His Pro Asp Val Asn Val Lys Ile Val Pro
275 280 285

Cys Gly Met Asn Tyr Phe His Pro His Lys Phe Arg Ser Arg Ala Val
290 295 300

Val Glu Phe Gly Asp Pro Ile Glu Ile Pro Lys Glu Leu Val Ala Lys
305 310 315 320

Tyr His Asn Pro Glu Thr Asn Arg Asp Ala Val Lys Glu Leu Leu Asp
325 330 335

Thr Ile Ser Lys Gly Leu Gln Ser Val Thr Val Thr Cys Ser Asp Tyr
340 345 350

Glu Thr Leu Met Val Val Gln Thr Ile Arg Arg Leu Tyr Met Thr Gln
355 360 365

Phe Ser Thr Lys Leu Pro Leu Pro Leu Ile Val Glu Met Asn Arg Arg
370 375 380

Met Val Lys Gly Tyr Glu Phe Tyr Arg Asn Asp Pro Lys Ile Ala Asp
385 390 395 400

Leu Thr Lys Asp Ile Met Ala Tyr Asn Ala Ala Leu Arg His Tyr Asn
405 410 415

Leu Pro Asp His Leu Val Glu Glu Ala Lys Val Asn Phe Ala Lys Asn
420 425 430

Leu Gly Leu Val Phe Phe Arg Ser Ile Gly Leu Cys Ile Leu Phe Ser
435 440 445

Leu Ala Met Pro Gly Ile Ile Met Phe Ser Pro Val Phe Ile Leu Ala
450 455 460

Lys Arg Ile Ser Gln Glu Lys Ala Arg Thr Ala Leu Ser Lys Ser Thr
465 470 475 480

Val Lys Ile Lys Ala Asn Asp Val Ile Ala Thr Trp Lys Ile Leu Ile
485 490 495

Gly Met Gly Phe Ala Pro Leu Leu Tyr Ile Phe Trp Ser Val Leu Ile
500 505 510

Thr Tyr Tyr Leu Arg His Lys Pro Trp Asn Lys Ile Tyr Val Phe Ser
515 520 525

Gly Ser Tyr Ile Ser Cys Val Ile Val Thr Tyr Ser Ala Leu Ile Val
530 535 540

Gly Asp Ile Gly Met Asp Gly Phe Lys Ser Leu Arg Pro Leu Val Leu
545 550 555 560

Ser Leu Thr Ser Pro Lys Gly Leu Gln Lys Leu Gln Lys Asp Arg Arg
565 570 575

Asn Leu Ala Glu Arg Ile Ile Glu Val Val Asn Asn Phe Gly Ser Glu
580 585 590

Leu Phe Pro Asp Phe Asp Ser Ala Ala Leu Arg Glu Glu Phe Asp Val
595 600 605

Ile Asp Glu Glu Glu Glu Asp Arg Lys Thr Ser Glu Leu Asn Arg Arg
610 615 620

Lys Met Leu Arg Lys Gln Lys Ile Lys Arg Gln Glu Lys Asp Ser Ser
625 630 635 640

Ser Pro Ile Ile Ser Gln Arg Asp Asn His Asp Ala Tyr Glu His His
645 650 655

Asn Gln Asp Ser Asp Gly Val Ser Leu Val Asn Ser Asp Asn Ser Leu
660 665 670

Ser Asn Ile Pro Leu Phe Ser Ser Thr Phe His Arg Lys Ser Glu Ser
675 680 685

Ser Leu Ala Ser Thr Ser Val Ala Pro Ser Ser Ser Ser Glu Phe Glu
690 695 700

Val Glu Asn Glu Ile Leu Glu Glu Lys Asn Gly Leu Ala Ser Lys Ile
705 710 715 720

Ala Gln Ala Val Leu Asn Lys Arg Ile Gly Glu Asn Thr Ala Arg Glu
725 730 735

Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu 
740 745 750 

Glu Gly Lys Glu Gly Asp Ala 
755 

<210> 222 
<211> 743 
<212> PRT 

<213> Saccharomyces sp.
<400> 222

Met Ser Ala Pro Ala Ala Asp His Asn Ala Ala Lys Pro Ile Pro His
1 5 10 15

Val Pro Gln Ala Ser Arg Arg Tyr Lys Asn Ser Tyr Asn Gly Phe Val
20 25 30

Tyr Asn Ile His Thr Trp Leu Tyr Asp Val Ser Val Phe Leu Phe Asn
35 40 45

Ile Leu Phe Thr Ile Phe Phe Arg Glu Ile Lys Val Arg Gly Ala Tyr
50 55 60

Asn Val Pro Glu Val Gly Val Pro Thr Ile Leu Val Cys Ala Pro His
65 70 75 80

Ala Asn Gln Phe Ile Asp Pro Ala Leu Val Met Ser Gln Thr Arg Leu
85 90 95

Leu Lys Thr Ser Ala Gly Lys Ser Arg Ser Arg Met Pro Cys Phe Val
100 105 110

Thr Ala Glu Ser Ser Phe Lys Lys Arg Phe Ile Ser Phe Phe Gly His
115 120 125

Ala Met Gly Gly Ile Pro Val Pro Arg Ile Gln Asp Asn Leu Lys Pro
130 135 140

Val Asp Glu Asn Leu Glu Ile Tyr Ala Pro Asp Leu Lys Asn His Pro
145 150 155 160

Glu Ile Ile Lys Gly Arg Ser Lys Asn Pro Gln Thr Thr Pro Val Asn
165 170 175

Phe Thr Lys Arg Phe Ser Ala Lys Ser Leu Leu Gly Leu Pro Asp Tyr
180 185 190

Leu Ser Asn Ala Gln Ile Lys Glu Ile Pro Asp Asp Glu Thr Ile Ile
195 200 205

Leu Ser Ser Pro Phe Arg Thr Ser Lys Ser Lys Val Val Glu Leu Leu
210 215 220

Thr Asn Gly Thr Asn Phe Lys Tyr Ala Glu Lys Ile Asp Asn Thr Glu
225 230 235 240

Thr Phe Gln Ser Val Phe Asp His Leu His Thr Lys Gly Cys Val Gly
245 250 255

Ile Phe Pro Glu Gly Gly Ser His Asp Arg Pro Ser Leu Leu Pro Ile
260 265 270

Lys Ala Gly Val Ala Ile Met Ala Leu Gly Ala Val Ala Ala Asp Pro
275 280 285

Thr Met Lys Val Ala Val Val Pro Cys Gly Leu His Tyr Phe His Arg
290 295 300

Asn Lys Phe Arg Ser Arg Ala Val Leu Glu Tyr Gly Glu Pro Ile Val
305 310 315 320

Val Asp Gly Lys Tyr Gly Glu Met Tyr Lys Asp Ser Pro Arg Glu Thr
325 330 335

Val Ser Lys Leu Leu Lys Lys Ile Thr Asn Ser Leu Phe Ser Val Thr
340 345 350

Glu Asn Ala Pro Asp Tyr Asp Thr Leu Met Val Ile Gln Ala Ala Arg
355 360 365

Arg Leu Tyr Gln Pro Val Lys Val Arg Leu Pro Leu Pro Ala Ile Val
370 375 380

Glu Ile Asn Arg Arg Leu Leu Phe Gly Tyr Ser Lys Phe Lys Asp Asp
385 390 395 400

Pro Arg Ile Ile His Leu Lys Lys Leu Val Tyr Asp Tyr Asn Arg Lys
405 410 415

Leu Asp Ser Val Gly Leu Lys Asp His Gln Val Met Gln Leu Lys Thr
420 425 430

Thr Lys Leu Glu Ala Leu Arg Cys Phe Val Thr Leu Ile Val Arg Leu
435 440 445

Ile Lys Phe Ser Val Phe Ala Ile Leu Ser Leu Pro Gly Ser Ile Leu
450 455 460

Phe Thr Pro Ile Phe Ile Ile Cys Arg Val Tyr Ser Glu Lys Lys Ala
465 470 475 480

Lys Glu Gly Leu Lys Lys Ser Leu Val Lys Ile Lys Gly Thr Asp Leu
485 490 495

Leu Ala Thr Trp Lys Leu Ile Val Ala Leu Ile Leu Ala Pro Ile Leu
500 505 510

Tyr Val Thr Tyr Ser Ile Leu Leu Ile Ile Leu Ala Arg Lys Gln His
515 520 525

Tyr Cys Arg Ile Trp Val Pro Ser Asn Asn Ala Phe Ile Gln Phe Val
530 535 540

Tyr Phe Tyr Ala Leu Leu Val Phe Thr Thr Tyr Ser Ser Leu Lys Thr
545 550 555 560

Gly Glu Ile Gly Val Asp Leu Phe Lys Ser Leu Arg Pro Leu Phe Val
565 570 575

Ser Ile Val Tyr Pro Gly Lys Lys Ile Glu Glu Ile Gln Thr Thr Arg
580 585 590

Lys Asn Leu Ser Leu Glu Leu Thr Ala Val Cys Asn Asp Leu Gly Pro
595 600 605

Leu Val Phe Pro Asp Tyr Asp Lys Leu Ala Thr Glu Ile Phe Ser Lys
610 615 620

Arg Asp Gly Tyr Asp Val Ser Ser Asp Ala Glu Ser Ser Ile Ser Arg
625 630 635 640

Met Ser Val Gln Ser Arg Ser Arg Ser Ser Ser Ile His Ser Ile Gly
645 650 655

Ser Leu Ala Ser Asn Ala Leu Ser Arg Val Asn Ser Arg Gly Ser Leu
660 665 670

Thr Asp Ile Pro Ile Phe Ser Asp Ala Lys Gln Gly Gln Trp Lys Ser
675 680 685

Glu Gly Glu Thr Ser Glu Asp Glu Asp Glu Phe Asp Glu Lys Asn Pro
690 695 700

Ala Ile Val Gln Thr Ala Arg Ser Ser Asp Leu Asn Lys Glu Asn Ser
705 710 715 720

Arg Asn Thr Asn Ile Ser Ser Lys Ile Ala Ser Leu Val Arg Gln Lys
725 730 735

Arg Glu His Glu Lys Lys Glu 
740 

<210> 223
<211> 397 
<212> PRT 

<213> Saccharomyces sp. 
<400> 223 

Met Leu His Gln Lys Ile Ala His Lys Val Arg Lys Val Val Val Pro
1 5 10 15

Gly Ile Ser Leu Leu Ile Phe Phe Gln Gly Cys Leu Ile Leu Leu Phe
20 25 30

Leu Gln Leu Thr Tyr Lys Thr Leu Tyr Cys Arg Asn Asp Ile Arg Lys
35 40 45

Gln Ile Gly Leu Asn Lys Thr Lys Arg Leu Phe Ile Val Leu Val Ser
50 55 60

Ser Ile Leu His Val Val Ala Pro Ser Ala Val Arg Ile Thr Thr Glu
65 70 75 80

Asn Ser Ser Val Pro Lys Gly Thr Phe Phe Leu Asp Leu Lys Lys Lys
85 90 95

Arg Ile Leu Ser His Leu Lys Ser Asn Ser Val Ala Ile Cys Asn His
100 105 110

Gln Ile Tyr Thr Asp Trp Ile Phe Leu Trp Trp Leu Ala Tyr Thr Ser
115 120 125

Asn Leu Gly Ala Asn Val Phe Ile Ile Leu Lys Lys Ser Leu Ala Ser
130 135 140

Ile Pro Ile Leu Gly Phe Gly Met Arg Asn Tyr Asn Phe Ile Phe Met
145 150 155 160

Ser Arg Lys Trp Ala Gln Asp Lys Ile Thr Leu Ser Asn Ser Leu Ala
165 170 175

Gly Leu Asp Ser Asn Ala Arg Gly Ala Gly Ser Leu Ala Gly Lys Ser
180 185 190

Pro Glu Arg Ile Thr Glu Glu Gly Glu Ser Ile Trp Asn Pro Glu Val
195 200 205

Ile Asp Pro Lys Gln Ile His Trp Pro Tyr Asn Leu Ile Leu Phe Pro
210 215 220

Glu Gly Thr Asn Leu Ser Ala Asp Thr Arg Gln Lys Ser Ala Lys Tyr
225 230 235 240

Ala Ala Lys Ile Gly Lys Lys Pro Phe Lys Asn Val Leu Leu Pro His
245 250 255

Ser Thr Gly Leu Arg Tyr Ser Leu Gln Lys Leu Lys Pro Ser Ile Glu
260 265 270

Ser Leu Tyr Asp Ile Thr Ile Gly Tyr Ser Gly Val Lys Gln Glu Glu
275 280 285

Tyr Gly Glu Leu Ile Tyr Gly Leu Lys Ser Ile Phe Leu Glu Gly Lys
290 295 300

Tyr Pro Lys Leu Val Asp Ile His Ile Arg Ala Phe Asp Val Lys Asp
305 310 315 320

Ile Pro Leu Glu Asp Glu Asn Glu Phe Ser Glu Trp Leu Tyr Lys Ile
325 330 335

Trp Ser Glu Lys Asp Ala Leu Met Glu Arg Tyr Tyr Ser Thr Gly Ser
340 345 350

Phe Val Ser Asp Pro Glu Thr Asn His Ser Val Thr Asp Ser Phe Lys
355 360 365

Ile Asn Arg Ile Glu Leu Thr Glu Val Leu Ile Leu Pro Thr Leu Thr
370 375 380

Ile Ile Trp Leu Val Tyr Lys Leu Tyr Cys Phe Ile Phe
385 390 395



<210> 224 
<211> 303 
<212> PRT 

<213> Saccharomyces sp. 
<400> 224 

Met Ser Val Ile Gly Arg Phe Leu Tyr Tyr Leu Arg Ser Val Leu Val
1 5 10 15

Val Leu Ala Leu Ala Gly Cys Gly Phe Tyr Gly Val Ile Ala Ser Ile
20 25 30

Leu Cys Thr Leu Ile Gly Lys Gln His Leu Ala Gln Trp Ile Thr Ala
35 40 45

Arg Cys Phe Tyr His Val Met Lys Leu Met Leu Gly Leu Asp Val Lys
50 55 60

Val Val Gly Glu Glu Asn Leu Ala Lys Lys Pro Tyr Ile Met Ile Ala
65 70 75 80

Asn His Gln Ser Thr Leu Asp Ile Phe Met Leu Gly Arg Ile Phe Pro
85 90 95

Pro Gly Cys Thr Val Thr Ala Lys Lys Ser Leu Lys Tyr Val Pro Phe
100 105 110

Leu Gly Trp Phe Met Ala Leu Ser Gly Thr Tyr Phe Leu Asp Arg Ser
115 120 125

Lys Arg Gln Glu Ala Ile Asp Thr Leu Asn Lys Gly Leu Glu Asn Val
130 135 140

Lys Lys Asn Lys Arg Ala Leu Trp Val Phe Pro Glu Gly Thr Arg Ser
145 150 155 160

Tyr Thr Ser Glu Leu Thr Met Leu Pro Phe Lys Lys Gly Ala Phe His
165 170 175

Leu Ala Gln Gln Gly Lys Ile Pro Ile Val Pro Val Val Val Ser Asn
180 185 190

Thr Ser Thr Leu Val Ser Pro Lys Tyr Gly Val Phe Asn Arg Gly Cys
195 200 205

Met Ile Val Arg Ile Leu Lys Pro Ile Ser Thr Glu Asn Leu Thr Lys
210 215 220

Asp Lys Ile Gly Glu Phe Ala Glu Lys Val Arg Asp Gln Met Val Asp
225 230 235 240

Thr Leu Lys Glu Ile Gly Tyr Ser Pro Ala Ile Asn Asp Thr Thr Leu
245 250 255

Pro Pro Gln Ala Ile Glu Tyr Ala Ala Leu Gln His Asp Lys Lys Val
260 265 270

Asn Lys Lys Ile Lys Asn Glu Pro Val Pro Ser Val Ser Ile Ser Asn
275 280 285

Asp Val Asn Thr His Asn Glu Gly Ser Ser Val Lys Lys Met His
290 295 300



<210> 225 
<211> 1146 
<212> DNA 

<213> Saccharomyces sp. 
<400> 225 

atgtctttta gggatgtcct agaaagagga gatgaatttt tagaagccta tcccagaaga 60 
agcccccttt ggagatttct ttcatacagt acatcattac tgaccttcgg tgtatcaaaa 120 
ctgcttcttt tcacatgcta taatgtcaaa ttgaatggtt ttgaaaaatt agaaactgcc 180
ttggaacgtt ccaaaaggga aaatagaggc cttatgacgg tcatgaacca tatgagtatg 240 
gtcgatgatc cgttagtttg ggcaacacta ccatataagt tatttacgtc tttggacaac 300 
ataagatggt ctttgggtgc acataatatt tgctttcaaa ataaatttct ggccaacttt 360 
ttctcacttg gccaagtcct ttcaacagaa agatttgggg tgggcccatt tcaaggttct 420 
atagatgctt caataagatt gttaagccct gacgacactt tagacttgga atggacccct 480 
cactctgagg tctcttcttc gctaaaaaaa gcctactccc cgcccataat aaggtcgaag 540 
ccatcttggg tccatgttta tccagaagga tttgtactac aattatatcc gccttttgaa 600 
aattcgatga ggtattttaa atggggtatt accagaatga tcctagaagc aacaaagccg 660 
cccattgtag taccaatatt tgctacaggg tttgaaaaaa tagcatccga agcagtcaca 720 
gattcaatgt ttagacaaat tctaccaaga aactttggct ctgaaataaa tgttaccata 780 
ggggatcctt taaatgatga tttaatcgac aggtatagaa aagaatggac acatttggtt 840 
gaaaaatact atgatcccaa aaatcctaac gacctctctg acgaattgaa atatggtaaa 900 
gaggcgcaag atttaagaag cagattagcc gctgaactga gagcccatgt tgctgaaatt 960 
agaaatgaag ttcgcaaatt accacgcgaa gaccctaggt tcaaatcccc ctcatggtgg 1020
aagcggttca acaccacgga aggtaaatcg gacccagatg ttaaagtcat tggcgaaaat 1080
tgggcaataa ggaggatgca aaagtttctg cctccagagg gtaaaccaaa gggtaaggat 1140
gattga 1146

<210> 226 
<211> 1191 
<212> DNA 

<213> Saccharomyces sp. 
<400> 226 

atgaagcatt cccaaaaata ccgtaggtat ggaatttatg aaaagactgg taatcccttt 60 
ataaaagggt tgcaaaggct gcttatcgct tgcttgttca tttcaggctc gctgagtatt 120 
gtcgtttttc agatctgtct acaggtgctt ctcccttgga gcaagattag atttcaaaat 180 
ggtataaatc aaagtaagaa ggcttttatc gttttattat gcatgatctt gaacatggtg 240 
gctccctctt ctttgaatgt cacttttgaa acatcgcggc cattgaagaa ctcttctaac 300 
gccaagccat gctttagatt taaagacagg gctataataa ttgcaaatca tcaaatgtat 360 
gcagactgga tttatctctg gtggctttcc tttgtttcaa atttgggtgg taacgtttat 420 
atcatcctga agaaagctct gcagtacata ccattactgg gatttggcat gcgaaatttt 480 
aagtttatat ttttaagtag gaactggcaa aaggatgaga aagctttaac aaatagtttg 540 
gtttctatgg acttaaacgc gaggtgcaag gggcccctta caaattataa gagttgttat 600
tccaagacaa atgaatccat tgccgcttat aatttaatca tgttccctga gggtacaaat 660 
ctaagcctca agacaagaga aaaaagcgag gcattctgtc aaagagcaca tttggaccat 720 
gtccaattaa gacatttgtt attaccgcac tctaaaggct tgaagtttgc agtagaaaaa 780 
ctagctccta gtttagatgc tatctacgat gtcactattg gatattctcc cgccttgaga 840 
acggaatacg tcggcaccaa attcaccttg aagaaaatat tcttaatggg tgtctatccg 900 
gagaaagtag atttttatat tagggaattt agagttaatg agatcccttt gcaagatgac 960 
gaagtttttt tcaattggtt actgggcgtg tggaaagaaa aagatcaact gctagaagac 1020
tactacaaca caggccaatt taaaagtaat gctaaaaatg acaaccaatc catcgttgtt 1080
acgacacaaa cgactggatt tcagcacgaa acattgacac cccgtatcct ttcatattac 1140
gggttcttcg cttttcttat tcttgtattt gtgatgaaaa aaaatcattg a 1191

<210> 227
<211> 1440 
<212> DNA 

<213> Saccharomyces sp. 
<400> 227 

atgggttttg ttgatttctt cgaaacatat atggtcggtt ctagggtcca gttcaaacag 60
ttagatattt ctgattggtt gagtctgacc ccaaggttgc ttattctttt tggctatttt 120
taccttcatt ctttttttac tgcaatcaat caattcctac agttcattaa cacgaattcc 180
ttctgtctta gactgcattt actatatgac agattttggt cgcatgtgcc cataataggt 240
gagtacaaaa ttcggctgct ctcgagggca ctgacatata gtaaactgaa aataatacca 300
actttagaca aggtgctgga ggcgattgaa atttggtttc agctacattt agttgaaatg 360
accttcgaaa aaaaaaaaaa cgtccaaatt ttcataaccg agggaagtga tgacctaaac 420
ttttttaaag atagcaaatt ccaaaccaca ttaatgatat gtaatcatcg atcagtgaat 480
gactacacat tgattaatta cctttttctc aaaagttgtc ccaccaagtt ttatactaaa 540
tgggaatttc tacaaaagct gaggaagggg gaagatctag ctgaatggcc tcagttaaaa 600
tttcttggtt ggggaaaaat gtttaacttt cctcgattgg atctactaaa gaacatattc 660
ttcaaagatg aaacactcgc actctcatcg aatgagttaa gagatatttt agaaagacaa 720
aacaatcaag ctattactat ttttcccgaa gtcaatatca tgagtttgga actatcaatt 780
attcaaagaa aattacacca agattttccc tttgttataa acttctataa tttattatac 840
ccaagattta aaaactttac cactttgatg gctgcttttt catcaattaa aaacatcaaa 900
agaaagaaaa accgtaacaa tataatcaaa gaggcccgat acctgtttca cagagaactt 960
gacaaattag ttcacaagag catgaaaatg gagtcttcca aggtatccga taagacgacg 1020
ccgcccatga tcgtagataa ttcatactta cttacaaaaa aggaagaaat cagcagcggc 1080
aagcccaagg tggtacgaat caatccatac atatatgatg tcaccataat ttattaccga 1140
gtcaaatata ctgatagtgg gcatgatcat accaacggag atttgagact tcataaaggt 1200
tatcaattag agcaaatatc tccgacaatc tttgagatga ttcaaccaga aatggagtct 1260
gaaaacaaca taaaggataa ggaccccatt gttgtgatgg taaatgtaaa aaagcatcaa 1320
attcaaccat tactcgcata caatgatgag agtttagaaa agtggcttga aaataggtgg 1380
atagaaaaag atagattaat cgagtccttg caaaaaaata ttaaaattga gaccaaataa 1440

<210> 228
<211> 903
<212> DNA

<213> Saccharomyces sp.
<400> 228

atggaaaagt acaccaattg gagagacaat ggtacgggaa tagctccatt tctaccaaac 60
acaatcagga aacctagtaa ggtgatgaca gcgtgtttgt tgggtatcct aggggtgaaa 120
accattataa tgctaccatt gattatgctg taccttctaa ctggccagaa caacttactg 180
ggtttgatat tgaagtttac attcagttgg aaagaggaaa ttaccgtgca aggaatcaag 240
aaacgtgacg taaggaaatc caagcattat ccacagaagg gcaagcttta tatttgcaat 300
tgtacctcac ctttagatgc tttttcagtg gtgttattag ctcaagggcc tgttacgttg 360
ttggtcccat ccaatgacat tgtatacaaa gtttccataa gagaattcat caacttcatc 420
ctcgccggtg ggttagatat aaaactctat ggccacgagg tagcagagct atctcaattg 480
ggcaataccg tgaattttat gtttgctgag ggtacctcat gtaatggtaa aagcgtctta 540
ccgtttagta taaccgggaa aaaacttaaa gaattcatag acccttcaat aaccacaatg 600
aaccccgcaa tggccaaaac taaaaaattt gaattgcaga ccatccaaat caaaactaat 660
aaaactgcca tcaccacatt gcccatctcc aatatggagt atttatctag atttctgaac 720
aagggcatta atgttaaatg caagatcaac gagccacaag tactctcgga taatttagag 780
gaattacgcg ttgcattaaa cggtggcgac aaatataaac tagtctcacg gaagttagat 840
gttgaatcta agaggaattt tgtgaaggaa tatatcagcg atcaacgtaa aaagaggaag 900
tag 903

<210> 229
<211> 2280
<212> DNA

<213> Saccharomyces sp.
<400> 229

atgcctgcac caaaactcac ggagaaattt gcctcttcca agagcacaca gaaaactacg 60
aattacagtt ccatcgaggc caaaagcgtc aagacgtcgg ctgatcaggc atacatctac 120
caagagccta gcgctaccaa gaagatactt tactccatcg ccacatggct gttgtacaac 180
atcttccact gcttctttag agaaatcaga ggccggggca gtttcaaggt accgcaacag 240 
ggaccggtga tctttgttgc ggctccgcat gctaaccagt tcgtcgaccc tgtaatcctt 300 
atgggcgagg tgaagaaatc tgtcaacaga cgtgtgtcct tcttgattgc ggagagctca 360 
ttaaagcaac cccccatagg gtttttggct agtttcttca tggccatagg cgtggtaagg 420 
ccgcaggata atttgaaacc ggcagaaggt actatccgcg tagatccaac agactacaag 480 
agagttatcg gccacgacac gcatttcttg actgattgta tgccaaaggg tctcatcggg 540 
ttacccaaat caatgggatt tggagaaatc cagtccatag aaagtgacac gagtttgacc 600 
ctaagaaaag agttcaaaat ggccaaacca gagattaaaa ctgctttact caccggcact 660 
acttataaat atgccgctaa agtcgaccaa tcttgcgttt accatagagt ttttgagcat 720 
ttggcccata acaactgcat tgggatcttt cctgaaggtg ggtcccacga cagaacaaac 780 
ttgttgcccc tgaaagcagg tgtggcgatt atggctcttg gttgcatgga taagcatcct 840 
gacgtcaatg ttaagattgt tccctgcggt atgaattatt tccatccaca taagttcagg 900 
tcgagagcgg ttgttgaatt cggtgacccc attgaaatac cgaaggaact agtcgccaag 960 
taccacaacc cggaaacgaa cagagatgca gtgaaagaat tattagatac catatcgaag 1020
ggtttacaat ccgttaccgt tacatgttct gattatgaaa ctttgatggt ggttcaaacg 1080
ataagaagac tatatatgac acaatttagc accaagttac cgttgccctt gattgtggaa 1140
atgaacagaa gaatggtcaa aggttacgaa ttctatagaa acgatcctaa aatagcggac 1200
ttgaccaaag atataatggc atataatgcc gccttgagac actataatct tcctgatcac 1260
cttgtggagg aggcaaaggt aaatttcgca aaaaacctcg gacttgtttt ttttagatcc 1320
atcgggctct gcatcctctt ttcgttagcc atgccaggta tcattatgtt ctcacctgtc 1380
ttcatattag ccaagagaat ttctcaagaa aaggcccgta ccgctttgtc caagtctaca 1440
gttaaaataa aggctaacga tgtcattgcc acgtggaaaa tcttgattgg gatgggattt 1500
gcgcccttgc tttacatctt ttggtccgtt ttaatcactt attacctcag acataaacca 1560
tggaataaaa tatatgtttt ttccgggtct tacatctcgt gtgttatagt cacgtattcc 1620
gccttaatcg tgggtgatat tggtatggat ggtttcaaat ctttgagacc actggtttta 1680
tctcttacat ctccaaaggg cttgcaaaag ctacaaaagg atcgtagaaa tctggcagaa 1740
agaataatcg aagttgtaaa taactttgga agcgaattat tccccgattt cgatagtgcc 1800
gccctacgtg aagaattcga cgtcatcgat gaagaggaag aagatcgaaa aacctcagaa 1860
ttgaatcgca ggaaaatgct aagaaaacag aaaataaaaa gacaagaaaa agattcgtca 1920
tcacctatca tcagccaacg tgacaaccac gatgcctatg aacaccataa ccaagattcc 1980
gatggcgtct cattggtcaa tagtgacaat tccctctcta acattccatt attctcttct 2040
acttttcatc gtaagtcaga gtcttcctta gcttcgacat ccgttgcacc ttcttcttcc 2100
tccgaatttg aggtagaaaa cgaaatcttg gaggaaaaaa atggattagc aagtaaaatc 2160
gcacaggccg tcttaaacaa gagaattggt gaaaatactg ccagggaaga ggaagaggaa 2220
gaagaagagg aagaagaaga agaggaagaa gaagaagaag ggaaagaagg agatgcgtag 2280

<210> 230 
<211> 2232 
<212> DNA 

<213> Saccharomyces sp. 
<400> 230 

atgtctgctc ccgctgccga tcataacgct gccaaaccta ttcctcatgt acctcaagcg 60
tcccgacggt acaaaaattc atacaatgga ttcgtataca atatacatac atggctgtat 120
gatgtgtctg tatttctgtt taatattttg ttcactattt tcttcagaga aattaaggta 180
cgtggtgcat ataacgttcc cgaagttggg gtgccaacca tccttgtgtg tgcccctcat 240
gcaaatcagt tcatcgaccc ggctttggta atgtcgcaaa cccgtttgct gaagacatca 300






gcgggaaagt cccgatccag aatgccttgt tttgttactg ctgagtcgag ttttaagaaa 360 
agatttatct ctttctttgg tcacgcaatg ggcggtattc ccgtgcctag aattcaggac 420 
aacttgaagc cagtggatga gaatcttgag atttacgctc cggacttgaa gaaccacccg 480 
gaaatcatca agggccgctc caagaaccca cagactacac cagtgaactt tacgaaaagg 540 
ttttctgcca agtccttgct tggattgccc gactacttaa gtaatgctca aatcaaggaa 600 
atcccggatg atgaaacgat aatcttgtcc tctccattca gaacatcgaa atcaaaagtg 660 
gtggagctct tgactaatgg tactaatttt aaatatgcag agaaaatcga caatacggaa 720 
actttccaga gtgtttttga tcacttgcat acgaagggct gtgtaggtat tttccccgag 780 
ggtggttctc atgaccgtcc ttcgttacta cccatcaagg caggtgttgc cattatggct 840 
ctgggcgcag tagccgctga tcctaccatg aaagttgctg ttgtaccctg tggtttgcat 900 
tatttccaca gaaataaatt cagatctaga gctgttttag aatacggcga acctatagtg 960 
gtggatggga aatatggcga aatgtataag gactccccac gtgagaccgt ttccaaacta 1020
ctaaaaaaga tcaccaattc tttgttttct gttaccgaaa atgctccaga ttacgatact 1080
ttgatggtca ttcaggctgc cagaagacta tatcaaccgg taaaagtcag gctacctttg 1140
cctgccattg tagaaatcaa cagaaggtta cttttcggtt attccaagtt taaagatgat 1200
ccaagaatta ttcacttaaa aaaactggta tatgactaca acaggaaatt agattcagtg 1260
ggtttaaaag accatcaggt gatgcaatta aaaactacca aattagaagc attgaggtgc 1320
tttgtaactt tgatcgttcg attgattaaa ttttctgtct ttgctatact atcgttaccg 1380
ggttctattc tcttcactcc aattttcatt atttgtcgcg tatactcaga aaagaaggcc 1440
aaagagggtt taaagaaatc attggttaaa attaagggta ccgatttgtt ggccacatgg 1500
aaacttatcg tggcgttaat attggcacca attttatacg ttacttactc gatcttgttg 1560
attattttgg caagaaaaca acactattgt cgcatctggg ttccttccaa taacgcattc 1620
atacaatttg tctattttta tgcgttattg gttttcacca cgtattcctc tttaaagacc 1680
ggtgaaatcg gtgttgacct tttcaaatct ttaagaccac tttttgtttc tattgtttac 1740
cccggtaaga agatcgaaga aatccaaaca acaagaaaga atttaagtct agagttgact 1800
gctgtttgta acgatttagg acctttggtt ttccctgatt acgataaatt agcgactgag 1860
atattctcta agagagacgg ttatgatgtc tcttctgatg cagagtcttc tataagtcgt 1920
atgagtgtac aatctagaag ccgctcttct tctatacatt ctattggctc gctagcttct 1980
aacgccctat caagagtgaa ttcaagaggc tcgttgaccg atattccaat tttttctgat 2040
gcaaagcaag gtcaatggaa aagtgaaggt gaaactagtg aggatgagga tgaatttgat 2100
gagaaaaatc ctgccatagt acaaaccgca cgaagttctg atctaaataa ggaaaacagt 2160
cgcaacacaa atatatcttc gaagattgct tcgctggtaa gacagaaaag agaacacgaa 2220
aagaaagaat ga 2232

<210> 231 
<211> 1194 
<212> DNA 

<213> Saccharomyces sp. 
<400> 231 

atgctgcatc aaaaaatagc tcataaagtt cgaaaagtcg tcgtcccagg tatttcctta 60 
ttgattttct tccagggatg ccttattctt ttgtttctcc aactcaccta taagactctt 120 
tactgtagaa atgatataag gaaacaaatt ggtctcaata aaaccaaaag attatttatt 180 
gtcttggtat catccatttt gcatgttgtc gcaccatctg cagtgagaat taccactgaa 240 
aattccagtg ttcctaaagg tacttttttt ttagacttga agaagaaaag gattctttct 300 
catctaaagt ccaattcggt ggccatttgc aatcaccaaa tatacacgga ttggatattt 360 
ttatggtggt tggcttacac atcgaactta ggggctaatg tcttcattat tttaaaaaaa 420 
tcgttggctt ccattcctat cctcggtttc ggtatgagaa actataattt catttttatg 480 
agtagaaagt gggcacaaga caaaataacc ctaagcaaca gccttgctgg ccttgattcg 540
aatgcaaggg gcgccggctc acttgctgga aagtcacctg agcgcataac tgaggaagga 600 
gagagcatat ggaatccgga ggttattgat ccaaaacaaa tccattggcc atacaatctt 660 
atcctattcc ctgaaggtac aaatctcagt gctgatacta ggcaaaaaag tgctaaatat 720 
gctgccaaaa taggcaaaaa gccattcaag aatgtgctac tgcctcattc tacaggccta 780 
agatactcgt tacaaaagtt gaagccaagt attgaaagtc tttatgatat tacgatcggc 840 
tactccggtg taaaacagga ggaatatggt gagcttatat atgggctgaa gagcatattt 900 
ttagaaggaa aatacccgaa gttagtcgat attcacatca gagcatttga tgttaaagat 960 
attccattag aggacgagaa tgaattttca gaatggctgt ataaaatttg gagtgagaag 1020
gatgctctaa tggaaaggta ctattccact ggatcattcg taagtgatcc tgaaacaaac 1080
cattcagtta ccgatagttt caagatcaat cgtattgagt taactgaagt gctaatatta 1140
ccaactctaa caataatttg gttagtttat aaactttatt gttttatttt ttga 1194

<210> 232 
<211> 912 
<212> DNA 

<213> Saccharomyces sp. 
<400> 232 

atgagtgtga taggtaggtt cttgtattac ttgaggtccg tgttggtcgt actggcgctt 60 
gcaggctgtg gcttttacgg tgtaatcgcc tctatccttt gcacgttaat cggtaagcaa 120 
catttggctc agtggattac tgcgcgttgt ttttaccatg tcatgaaatt gatgcttggc 180 
cttgacgtca aggtcgttgg cgaggagaat ttggccaaga agccatatat tatgattgcc 240 
aatcaccaat ccaccttgga tatcttcatg ttaggtagga ttttcccccc tggttgcaca 300 
gttactgcca agaagtcttt gaaatacgtc ccctttctgg gttggttcat ggctttgagt 360 
ggtacatatt tcttagacag atctaaaagg caagaagcca ttgacacctt gaataaaggt 420 
ttagaaaatg ttaagaaaaa caagcgtgct ctatgggttt ttcctgaggg taccaggtct 480 
tacacgagtg agctgacaat gttgcctttc aagaagggtg ctttccattt ggcacaacag 540 
ggtaagatcc ccattgttcc agtggttgtt tccaatacca gtactttagt aagtcctaaa 600 
tatggggtct tcaacagagg ctgtatgatt gttagaattt taaaacctat ttcaaccgag 660 
aacttaacaa aggacaaaat tggtgaattt gctgaaaaag ttagagatca aatggttgac 720 
actttgaagg agattggcta ctctcccgcc atcaacgata caaccctccc accacaagct 780 
attgagtatg ccgctcttca acatgacaag aaagtgaaca agaaaatcaa gaatgagcct 840 
gtgccttctg tcagcattag caacgatgtc aatacccata acgaaggttc atctgtaaaa 900 
aagatgcatt aa 912 

<210> 233 
<211> 54 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 233 

cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat ttaa 54 

<210> 234 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 234

tcgaggatcc gcggccgcaa gcttcctgca gg 32

<210> 235
<211> 32
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 235

tcgacctgca ggaagcttgc ggccgcggat cc 32

<210> 236 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 236 

tcgacctgca ggaagcttgc ggccgcggat cc 32

<210> 237 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 237

tcgaggatcc gcggccgcaa gcttcctgca gg 32

<210> 238 
<211> 36 
<212> DNA 

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 238

tcgaggatcc gcggccgcaa gcttcctgca ggagct 36

<210> 239 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 239

cctgcaggaa gcttgcggcc gcggatcc 28

<210> 240 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 240 

tcgacctgca ggaagcttgc ggccgcggat ccagct 36

<210> 241 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 241 

ggatccgcgg ccgcaagctt cctgcagg 28 



